

Part A: STRUCTURAL PRESENTATION.

   Throughout this documentation, written text may have several meanings which must be carefully separated. In order to do so, some graphical conventions will be used:

   The vertical bar ( | ) will denote a choice to be made among a finite number of options, as in for|if|while.

1 Types.

   The type of an atomic entity is defined by a basic type and, optionally, by a validation function. Nonatomic types are defined by the way other types are grouped together.

1.1 Atomic types.

   The basic built-in atomic types do not need declaration. They are:

   Note that no reference to actual size is being made. The programmer must know that 32-, 64- or 128-bit integers can be used, but the source need not.

1.2 Nonatomic types.

   Nonatomic types group together other types in various ways. The supported layouts are:

   The type string is a shorthand for sequence of char. The type fixedstring(number) is a shorthand for array(number) of char.

1.3 Added validations and user-defined types.

   A user-defined type is a refinement of one of the above types, defined through a validation function. Such a function is defined using the keyword type or reftype as a routine type. It may have side effects, and must return an atomic value.

A variable x has the user-defined type mytype if:

   Please note that variables carry user-defined type information, but values don't. This subtle difference will surface in 5.3.2 (assigning elements to nonatoms and tracking their types).

1.4 Context types.

   The type of a member of a record may be specified using the context information of the other members of the record.

   To do this, the member must be declared as "_" in the record declaration. Later, the type declaration has the following form:

[global ]type|reftype record_name.member_name(parameters)
...
end type|reftype

parameters has one or two parameters (see 8.1 below).
To refer to members of the structure, use the syntax .membername.

Example:

record stringWithIndex(

string s,_ index)

--type of this member will be defined later as stringWithIndex.index

end record

...some possibly unrelated code ...

type stringWithIndex.index(integer i)

return i>=0 and i<=length(.s)

end type

   A reftype might have been used as well.

1.5 Type checking.

1.5.1 The general case.

   When a variable is going to be assigned a value, it may be checked that the variable can hold such a value. To determine this, the type or reftype function attached to the variable is called, and the check succeeds if it returns True. A failed type check causes an exception. The handler for this exception may take corrective action or just let the running program abort.

   When the "with typecheck" directive is in force, this process is performed on every assignment. As this impacts performance, systematic type checking may be turned off. Even then, type checking will still take place at a few implementation-specific places.

1.5.2 Forced type checking.

   When systematic type checking is turned off, you may wish to keep some control over which type checks are performed, because you need the checking - for instance, when the type functions have side effects.

   To this end, you can add the keyword check as a prefix to a (ref)type. Types thus earmarked are always checked, even when type checking is off.

   You can further restrict checking to some variables by creating two twin types: one without the check prefix, and one that wraps the first and carries the check prefix.

   You can further fine-tune the checking by having user-defined types that return True (check passed) unless some condition is met, in which case some real action takes place instead.

1.6 Type aliasing.

   You can give alternate names to types. This may enhance code readability, as the same type may have several interpretations in the same program. But it is mostly needed to call type checking functions for types with a compound name, like sequence of integer. You can do this using the statement:

type | reftype altname is aliased

   This creates an alias and a function. The alias altname can be used wherever aliased could be used. The function altname() checks membership in the type aliased; it is a type or a reftype according to the statement used.

2 Basic tokens.

2.1 Identifiers.

   An identifier is a string of consecutive letters, digits and underscores, starting with a letter. What a letter precisely means depends on the implementation, but it always includes the ranges 'a'-'z' and 'A'-'Z'.

Lower and upper case letters are different; contrary to some other languages, OpenEuphoria is case sensitive, which means that the exact spelling of an identifier is taken into account.

So, "var" and "VaR" are different, valid identifiers, while "_top" or "2read" are not valid. "Ý3z2" may or may not be valid, according to implementation-specific rules.
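   The identifier rule can be modelled in a few lines of Python. This is only a sketch of the guaranteed ASCII baseline: the name is_identifier is hypothetical, and a real implementation may accept additional letters.

```python
import re

# Guaranteed baseline only: letters are limited to 'a'-'z' and 'A'-'Z'.
# Assumption of this sketch: no implementation-specific letters are accepted.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_identifier(text: str) -> bool:
    """Return True if text is a valid identifier under the baseline rule."""
    return _IDENTIFIER.fullmatch(text) is not None
```

   Under this model, "var" and "VaR" are accepted as two distinct names, while "_top" and "2read" are rejected because they do not start with a letter.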

2.2 Quoted characters.

   A quoted character is a single (double-byte) character inside single quotes, or an escape sequence, also inside single quotes. Supported escape sequences are:

\' single quote
\" double quote
\n newline (ASCII 10) (you cannot use \N)
\r carriage return (ASCII 13)
\t tab (ASCII 9)
\\ backslash
\(number) the character with ASCII/Unicode code number.

2.3 Text.

   Text is anything between double quotes. It is not processed at all, except for escape sequence resolution.

   Verbatim text is enclosed between matching groups of three consecutive double quotes ("""). In verbatim mode, any character, including whitespace, is considered as part of the string. There is no special meaning for the backslash character, and there is no escape sequence processing as a result.

   There is also an intermediate long text mode. Strings in this mode are enclosed between matching $" and "$. The line_end characters '\r' and '\n' are ignored in long text mode, but other characters, including escape sequences, are treated as in normal text mode.

Example 1:

"This is a very long string which "&
"has to be broken for readability reasons."

could be written as:

$"This is a very long string which
has to be broken for readability reasons."$

Example 2:

"This is a very long string which \n"& "has to be broken for readability reasons."

could be written as:

"""This is a very long string which
has to be broken for readability reasons."""

2.4 Numerical items.

   They fall into three categories:

The number must be in decimal digits.

   Internally, OpenEuphoria performs as many automatic conversions as it can, taking advantage of available hardware, to minimize memory usage by numerical items, while retaining the precision of these numbers.

   Additionally, underscores may be freely used inside numbers to enhance readability. An underscore is not a true digit, though: it must not start a number.

   Support for fractional numbers, which allow exact computations using the four elementary operations, is to be included.

2.5 Comments.

   Comments may appear at the end of any physical line of a source file. On an otherwise empty line, the comment may start the line.

   A comment starts with the characters "--" and extends to the physical end of line (the next line_end).

   OpenEuphoria does not process comments in any way. The facility is provided so that you can document your code, making it easier for others, or for yourself, to understand, maintain and upgrade it. Time spent commenting code will often bring a large reward in terms of cuts in maintenance and debugging time, if nothing else.

   Precise, concise, relevant, useful commenting is an obscure art that may make the difference between ordinary and outstanding coders.


3 Operations.

   They are defined by the use of infix or prefix operators, as opposed to routine calls, which use prefix identifiers acting on a list of arguments enclosed between parentheses.

   Remember that an infix notation is one that goes in between its operands (like the usual multiplication), while a prefix notation appears before its operands.

3.1 Supported operators.

   They are:

+ addition of numbers
- subtraction of numbers
* multiplication of numbers
/ division of numbers
& concatenation of sequences
&& bitwise and
|| bitwise or
^ binary inversion
~~ bitwise xor
<< binary left shift
>> binary right shift
>>> binary signed right shift

3.2 Extension to nonatomic types.

   If the left operand of one of the above operators is not atomic while the right operand is, the operation will be performed on each of the elements of the left operand. So, adding 1 to an array means adding 1 to each array element.

   If both operands are nonatomic and have the same length, the operation is performed on each pair of matching elements in turn. So, {3,5}+{2,-4}={3+2,5-4}.

   Contrary to Euphoria, this scheme does not automatically extend to logical operators. The "with seq_compat" directive turns the legacy behaviour on and off at will.
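   The two extension rules can be modelled in Python as follows. This is a sketch, not OpenEuphoria code: atoms are represented by Python numbers and nonatoms by lists, the name apply_op is hypothetical, and both the recursion into nested elements and the error raised on a length mismatch are assumptions that the text above does not spell out.

```python
import operator


def is_atom(x):
    """Atoms are modelled as Python numbers, nonatoms as lists."""
    return not isinstance(x, list)


def apply_op(op, left, right):
    """Apply a binary operator following the two extension rules."""
    if is_atom(left) and is_atom(right):
        return op(left, right)
    if not is_atom(left) and is_atom(right):
        # Nonatomic left operand, atomic right operand:
        # the operation is performed on each element of the left operand.
        return [apply_op(op, item, right) for item in left]
    if not is_atom(left) and not is_atom(right):
        # Both nonatomic: pair up matching elements.
        if len(left) != len(right):
            raise ValueError("operands must have the same length")  # assumption
        return [apply_op(op, a, b) for a, b in zip(left, right)]
    # Atomic left operand with nonatomic right operand: not described above.
    raise TypeError("case not covered by this sketch")
```

   With this model, apply_op(operator.add, [3, 5], [2, -4]) reproduces the {3,5}+{2,-4} example and yields [5, 1].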

3.3 Precedence hierarchy.

   When more than two operators appear in a row, without parentheses separating them, there is a choice to be made: which operation to perform first? This is an important question, since the results may differ.

   There is a predefined set of rules that tells the OpenEuphoria interpreter which choice to make. The rules may be overridden using parentheses to force another evaluation order.

   Here is the chart of operator precedence:

highest precedence: routine calls

  unary +/-
  bit-level operations
  * /
lowest precedence: &


   Routine calls are evaluated first, then parenthesized expressions, starting at the deepest nesting level of them. The order of evaluation of items of the same precedence is undefined.

   Thus, for instance, 3+2*4 is the same as 3+(2*4). To perform the addition first, code (3+2)*4.

   Also, if the function f sets x to 3 whatever its argument, x+f(x) is 3+3=6 regardless of what x is.

3.3 Formation of nonatomic objects.

   The construct {{expression}} creates an object of non atomic datatype whose first element is the first expression in the list and so on. {} denotes an empty nonatomic object.

   For this purpose, records are ordered in the way their elements were declared in the record definition of their type.

   "" is equivalent to an empty string.

3.4 Accessing elements of nonatomic objects.

   Single elements of nonatomic objects (called nonatoms from here on) are accessed using an index enclosed between square brackets, as in: ThisList[4]. Note that any nonatom, even the returned value from a function, may be indexed.

   Records have named parts, or members, which are accessed using the syntax record_name.member_name, as in: ThisCustomer.name .

   Since record fields are declared in an ordered way, records also support indexed accessing: the index "n" then refers to the n-th field in the declaring enumeration.

   Indexes may be negative, in which case the elements are counted backwards. So, ThisList[-1] is the last element of ThisList, ThisList[-2] the second last and so on.

   You can use floating point numbers as indexes. They are rounded to the next integer downward before any further processing. So, s[-0.3] is s[-1].

   0 is never a valid standalone index. See section 3.5 below for valid uses of 0 in index specifications.

   Indexes whose absolute value is greater than one plus the length of the container they index always cause an exception. 0 and +/-(length(container)+1) are only allowed when specifying an empty slice (see 3.5.2 below).
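   The rules for standalone indexes can be summarised by the following Python sketch. The function name normalize_index is hypothetical, and IndexError merely stands in for whatever exception an implementation raises.

```python
import math


def normalize_index(index, length):
    """Convert a standalone index to a standard positive position (1-based)."""
    position = math.floor(index)          # floating point indexes are rounded downward
    if position == 0:
        raise IndexError("0 is never a valid standalone index")
    if position < 0:
        position = length + 1 + position  # -1 is the last element, -2 the second last
    if not 1 <= position <= length:
        raise IndexError("index out of range")
    return position
```

   For a container of length 7, both -1 and -0.3 map to position 7, in line with the s[-0.3] example above.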

3.5 Statically accessing parts of nonatomic objects.

3.5.1 Nonempty slices.

   Accessing several elements in a row is possible, and is done through slices. A slice is a comma separated list of indexes and ranges. A range is specified as lower..upper, where lower and upper are the lower and upper desired index values. Obviously, the latter is not less than the former, after conversion to positive standard indexes.

   So, the statement:

NewList=ThisList[3,1..-4,-2,3..6]

   generates a list formed of elements of ThisList, in the following way:

NewList[1] is ThisList[3]
NewList[2] is ThisList[1]
NewList[3] is ThisList[2]
...
NewList[-5] is ThisList[-2]
NewList[-4] is ThisList[3]
...
NewList[-1] is ThisList[6]
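   To check the correspondence above, here is a Python sketch of a slice made of indexes and ranges. It is a model only: take and to_position are hypothetical names, a range lower..upper is written as the tuple (lower, upper), and empty ranges are deliberately left out.

```python
def to_position(index, length):
    """Map a nonzero index, possibly negative, to a 1-based position."""
    return length + 1 + index if index < 0 else index


def take(seq, *parts):
    """Model seq[part, part, ...]; each part is an index or a (lower, upper) range."""
    length = len(seq)
    result = []
    for part in parts:
        lower, upper = part if isinstance(part, tuple) else (part, part)
        lower, upper = to_position(lower, length), to_position(upper, length)
        if not (1 <= lower <= upper <= length):
            raise IndexError("invalid index or range")
        result.extend(seq[lower - 1:upper])  # Python slices are 0-based, end-exclusive
    return result
```

   For a ten-element list, take(this_list, 3, (1, -4), -2, (3, 6)) returns 13 elements: element 3, then elements 1 to 7, then element 9, then elements 3 to 6, which matches the positions listed for NewList.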

   Of course, if an element of a non-atom is a non-atom itself, several square bracketed index specifications may follow one another, like in: mymatrix[1..3][4]. This is a sequence of length three, exactly {mymatrix[1][4],mymatrix[2][4],mymatrix[3][4]}.

   For records, use names rather than indexes, even though indexes are an equally valid way to access record parts. For instance, the following

CustList[27..41][name,zipcode,nbOrders]

will generate a sequence of data extracted from a 15-element subsequence of CustList starting with the 27th element. We assumed that CustList is a sequence of Customers, which are records whose declaration involves members named name, nbOrders and zipcode. The statement above generates a sequence since all its elements have the same type. Each of its elements is a sequence (of object) of length 3, since it is quite likely formed by a string, another string and an integer.

   name has a rank in the enumeration of fields that build the Customer type. If that rank is 3, you could code

CustList[27..41][3,zipcode,nbOrders]

with exactly the same meaning as above. However, if that rank changes in future versions of your program, the "3" index will have to be changed to its new value, while the field name would remain the same. This is why using names is recommended over using indexes when possible.

   [..] and [] are shorthands for [1..-1]. [n..] is a shorthand for [n..-1]. The word end may be used to designate the last element of a sequence; it is synonymous with -1 in this context only. The same holds for the $ sign.

3.5.2 Empty slices.

   You may specify empty slices when they are made of ranges the upper value of which is exactly one less than the lower value. One of the index values must be valid though.

   Thus, s[2..1], s[4..-5] and s[1..0] are all empty objects, assuming length(s)=7 so that -5 reads as 3. The last example is the only valid use of 0 in indexes; any other situation causes an exception. In the same vein, s[2..3,1..0] has length 2, while s[3..2,1..0] is empty.

    But s[13..12] causes an exception, since both index values are way out of range.

   end and $ are again synonyms for "the last element of", or -1.
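   The empty-slice rules can be tested with a small Python model. This is a sketch under stated assumptions: range_length and is_valid_index are hypothetical names, and IndexError stands in for the exception an implementation would raise.

```python
def is_valid_index(index, length):
    """True if index designates an existing element (negative indexes allowed)."""
    return index != 0 and 1 <= abs(index) <= length


def range_length(lower, upper, length):
    """Number of elements selected by lower..upper in a container of that length."""
    # Convert negative values to positive positions; 0 and length+1 may
    # only show up here as part of an empty slice.
    low = length + 1 + lower if lower < 0 else lower
    up = length + 1 + upper if upper < 0 else upper
    if up == low - 1:
        # Empty slice: one of the two index values must be valid.
        if is_valid_index(lower, length) or is_valid_index(upper, length):
            return 0
        raise IndexError("empty slice with both indexes out of range")
    if is_valid_index(lower, length) and is_valid_index(upper, length) and low <= up:
        return up - low + 1
    raise IndexError("invalid range")
```

   With length 7, the ranges 2..1, 4..-5 and 1..0 all have length 0, while 13..12 is rejected.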

3.6 Dynamically accessing parts of non-atoms.

   If s has nonatomic type and if t has the format described below, s[[t]] is a valid syntax for a part of s with variable index depth. This is especially handy for tree management. s[[t]] is s[t[1]][t[2]]...[t[length(t)]].

   Each t[i] must be a sequence made of atoms and sequences of length 1 or 2. Atoms are converted to sequences of length 1, and both specify single indexes. Sequences of length 2 stand for slices in an obvious lower..upper way.

   For instance, assume t={2,{-1,{3,4}},{{1},4}}. Then

s[[t]]=s[2][-1,3..4][1..4]

   This sequence has three nonatomic elements, each of which is of length 4. Each of them consists of the 4 first elements of elements of s[2]. The first of these elements is the last of s[2]; the second and third are respectively third and fourth in s[2]. Note that the last {1} could be written 1 as well.

   The sequence t is said to be a subexpression representation for s[[t]] relative to s.

   Additionally, note that a list of indexes may come from a sequence through desequencing (see 5.5.2 below), so that s[#(t)] stands for s[t[1],t[2],...,t[-1] ].

3.7 Manipulating nonatoms.

   Manipulating nonatoms is always necessary once they are created and populated. While these manipulations can be done through a limited set of operations and routines stored in external files, this implies loss of performance and frequent reinventing of the wheel. For these reasons, OpenEuphoria provides quite a few built-in handling routines for nonatoms.

3.7.1 Getting information about nonatoms.

   Functions are provided in order to know how many, and which, elements are in the nonatom:

length(target)
is the number of elements of the nonatom target.
find(what,target)
returns 0 if what is not an element of target or of any (...(nonatomic element of)...) nonatomic element of target. Otherwise returns a positive integer. If target is an array or a sequence of items of the same type as what, this integer is the index of the first element of target to equal what.
find_all(what,target)
returns the possibly empty sequence of all integers which are indexes of elements of target which equal what.
match(what,target)
returns the lowest integer such that some slice of target starting at that position equals what. If there is no such integer, and what is not the empty sequence, 0 is returned. If what is the empty sequence, -1 is returned if target is not empty, and -2 if it is.

If what is an atom, match returns as find would.

match_all(what,target)
returns the possibly empty list of all integers i such that some slice of target starting at i equals what. Always returns the empty sequence if what is the empty sequence.
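   The return values of match and match_all, including the special cases for an empty what, can be modelled in Python as below. This is a sketch only: nonatoms are represented by Python lists, positions are 1-based as in the text, and the atom case (where match behaves like find) is not modelled.

```python
def match(what, target):
    """Lowest 1-based position at which a slice of target equals what."""
    if not what:
        # Special return values when what is the empty sequence.
        return -1 if target else -2
    for start in range(len(target) - len(what) + 1):
        if target[start:start + len(what)] == what:
            return start + 1
    return 0


def match_all(what, target):
    """All 1-based positions at which a slice of target equals what."""
    if not what:
        return []  # always empty when what is the empty sequence
    return [start + 1
            for start in range(len(target) - len(what) + 1)
            if target[start:start + len(what)] == what]
```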

3.7.2 Adding elements to sequences.

   Elements may be added to nonatoms as single objects or sequences, at any position in the sequence.

   Here is a list of available routines:

the & operator.
If s1 and s2 are nonatoms, s1 & s2 is a sequence whose length is the sum of the lengths of s1 and s2. The first length(s1) elements of s1 & s2 are those of s1, followed by those of s2.
append(s,x)
is a sequence obtained by adding x to s as its last element. Its length is length(s)+1 whatever x is.
prepend(s,x)
is a sequence obtained by adding x to s as its first element. Its length is length(s)+1 whatever x is.
insert(target,places,added)
is a sequence where the elements of nonatom added are inserted as single objects inside the nonatom target at the locations given by the nonatom places. The length of the returned sequence is the sum of the lengths of target and added.
places must be strictly increasing; strange, but sometimes desired effects may result otherwise. places may be an atom, in which case it is converted into a sequence of length 1 before further processing.
insert(target,places,added)
is a sequence where the elements of nonatom added are inserted as sequences inside the nonatom target at the locations given by the nonatom places. The length of the returned sequence is the sum of the length of target, the lengths of the nonatoms of added and the number of atoms in added.
places must be strictly increasing; strange, but sometimes desired effects may result otherwise. places may be an atom, in which case it is converted into a sequence of length 1 before further processing.

3.7.3 Removing elements from sequences.

   Three routines are provided:

3.7.3.1 The remove function.

    remove(target,places) returns the sequence target from which the elements whose index belongs to the sequence of integers places were removed, notionally starting from the last. places is assumed to be sorted in ascending order; strange, but sometimes desired results might happen otherwise.
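   A Python sketch of remove shows why the removal notionally starts from the last place. It is a model, not the built-in: only positive 1-based places are handled, and accepting a bare integer for places is an assumption of this sketch.

```python
def remove(target, places):
    """Return a copy of target without the elements at the given 1-based places."""
    if isinstance(places, int):
        places = [places]  # assumption: a single index is accepted as well
    result = list(target)
    # Start from the last place so that earlier deletions do not shift
    # the positions that are still to be processed.
    for place in reversed(places):
        del result[place - 1]
    return result
```

   Removing places {2,4} from a five-element sequence leaves its first, third and fifth elements.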

3.7.3.2 The replace function.

    replace(target,places,items) returns a nonatom of the same type as target. It is obtained by removing from target the slices specified in places and replacing the carved out slices by elements of the sequence of nonatoms items, inserted as sequences. Each element of the sequence places is a pair of integers, the lower and upper bounds of each slice to process. items must have the same length as places, as each slice specified in places is replaced by the element at the same position in items.

   If places has the form {i1,i2}, where i1 and i2 are integers, this is converted to {{i1,i2}} first. If places is just an integer i, this is changed to {{i,i}}.

3.7.3.3 The fit procedure.

   The call fit(target,source,padding) causes source to be copied to target even though the lengths may not match. When the lengths differ, an ordinary assignment would raise an error.

   If length(target)<=length(source), only the portion of source that fits into target is copied, effectively discarding elements of source with higher indexes.

   Otherwise, if padding is a char, the elements of target in excess relative to source's length are replaced by that char. padding may have the special value _, in which case these elements of target remain unchanged.
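   The behaviour of fit can be sketched in Python as an in-place copy. This is a model only: target and source are Python lists, and the special padding value _ is represented by the hypothetical constant KEEP.

```python
KEEP = None  # stands for the special padding value _ (assumption of this sketch)


def fit(target, source, padding=KEEP):
    """Copy source into target in place, never changing len(target)."""
    copied = min(len(target), len(source))
    # Only the portion of source that fits into target is copied.
    target[:copied] = source[:copied]
    if padding is not KEEP:
        # target is longer than source: pad the elements in excess.
        for i in range(copied, len(target)):
            target[i] = padding
```

   Copying a two-element source into a six-element target with padding '*' overwrites the first two elements and fills the remaining four with '*'; with the special value, the remaining four are left unchanged.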

3.7.4 Permutations on non-atoms.

   Nonatoms are ordered sets of elements; so, they can be reordered. As the number of permutations on a given number of symbols rapidly increases with that number, it is neither practical nor efficient to directly specify a permutation of a nonatom. However, the following functions cover the most frequent cases and can be combined into any sort of shuffling.

reverse(target)
returns the nonatom target with its order reversed: the first element becomes the last, the second element becomes the second last, and so on.
move(target,start,end,where)
moves target[start..end] to position where in target. An error will occur if the shifted slice extends past the end of the nonatom, i.e., if where+end-start-1 is greater than length(target).

4 Condition evaluation.

   Conditions are made of clauses linked together by logicals. A condition must evaluate to a boolean value of True or False. 0 stands for False, any other atom stands for True.

4.1 Truth tables for logicals.

   A truth table is a table assigning a boolean return value to any pair of booleans. To draw truth tables easily, we'll represent True by T and False by F.

4.1.1 The "and" operator.

      ! F ! T !
   ---+---+---!
    F ! F ! F !  Read: "and" returns False, except when both arguments are True.
   ---+---+---!  In that case only, it returns True.
    T ! F ! T !
   -----------!

4.1.2 The "or" operator.

      ! F ! T !
   ---+---+---!
    F ! F ! T !  Read: "or" returns True, except when both arguments are False.
   ---+---+---!  In that case only, it returns False.
    T ! T ! T !
   -----------!

4.1.3 The "not" operator.

      !   !
   ---+---!
    F ! T !  Read: "not" returns True if its argument is False, and False
   ---+---!  otherwise.
    T ! F !
   -------!

4.1.4 The = operator.

      ! F ! T !
   ---+---+---!
    F ! T ! F !  Read: "=" returns True when its operands have the same boolean
   ---+---+---!  value; else it returns False. This is the negation of the truth
    T ! F ! T !  table of "xor", which is not supported for this reason.
   -----------!


4.2 Short-circuit evaluation.

   From close inspection of the tables above, it follows that you need not always compute both arguments of a logical to know its return value; computing the first one is often enough.

This saves useless instruction execution, and may greatly simplify programming. The short-circuit rules are:

   Note that short-circuit evaluation applies to any use of logicals. This is not true in Euphoria, where it only applies inside the conditions of if, elsif and while statements. The "with RDS" directive turns compatibility mode on and off in this respect as well.

4.3 Example code: finding a name in an address book.

   Assume Address is a record type that has a member called name, and that addrbook is a sequence of Address. Then

i=1
while i<=length(addrbook) and addrbook[i].name!="myname" do

i+=1

end while
if i>length(addrbook) then i=0 end if

will scan the address book for a record whose name member is equal to "myname". A value of 0 stands for name not found; else i holds the ordinal of the first occurrence of "myname" in a member of a record in addrbook.

   Without short-circuit evaluation, this code would fail if "myname" is not found, because addrbook[length(addrbook)+1] would be evaluated, causing an exception. In such a case, the code would be something like:

found=0
for i=1 to length(addrbook) do

if addrbook[i].name="myname" then
found=i
exit
end if

end for

   So, an extra state variable is needed: even if i is available after the end for statement, a maximal value for i may mean that "myname" appeared as the last name or did not appear at all. The found variable is 0 on failure, and else means as above. And what if there was no exit statement?
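   For comparison, the short-circuit version of the search reads as follows in Python, whose and operator short-circuits in the same way. This is an illustration only: records are modelled as dictionaries, and the name find_name is hypothetical.

```python
def find_name(addrbook, name):
    """Return the 1-based position of the first record with that name, or 0."""
    i = 1
    # 'and' short-circuits: addrbook[i - 1] is never evaluated once
    # i has moved past the end of the list, so no exception is raised.
    while i <= len(addrbook) and addrbook[i - 1]["name"] != name:
        i += 1
    return 0 if i > len(addrbook) else i
```

   No extra state variable is needed here: the loop index alone tells whether the name was found.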

4.4 Side effects.

   As routine calls are resolved first, they may affect the variables appearing in a condition.

   Further, it may be desirable to record the value of expressions that appeared inside conditions. Because of short-circuit evaluation capabilities of OpenEuphoria, it is not always possible to compute the expressions prior to the condition evaluation, as this may raise exceptions.

   To address this situation, you can embed assignments in conditions, using the := form of the assignment operator.

   So:

if f0(a)=x:=f(b) and b=y:=g(a) then ...

will result in the following:

The =f(a) assignment might have been taken out of the if statement, for better readability.

thus taking into account the possible side effects of f and f0. It is not modified otherwise.

   An obvious use of this feature is to know why an if block was entered or not in the case of several clauses in the condition.
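   Python's assignment expressions (available from Python 3.8) offer a close analogue of the := form and can help to picture the interplay with short-circuit evaluation. The sketch below is an analogy, not a definition of the OpenEuphoria semantics: check is a hypothetical helper, None marks a variable that the condition did not assign, and Python evaluates strictly left to right, which may differ from the order described above.

```python
def check(a, b, f0, f, g):
    """Evaluate the condition and report which embedded assignments took place."""
    x = y = None  # None marks "not assigned by the condition"
    if f0(a) == (x := f(b)) and b == (y := g(a)):
        entered = True
    else:
        entered = False
    return entered, x, y
```

   When the first clause is false, the second clause is skipped and y keeps its previous value, so inspecting x and y afterwards tells why the block was or was not entered.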


5 Variables.

   Variables are tags that identify data the program where they appear will act upon. These tags are general_identifiers.

5.1 Properties of a variable.

   A variable has a number of attributes, or attached data that can be retrieved. They are called metadata, and are retrieved using the construct general_identifier@meta.

The available metadata are:

name
x@name is "x". Seems redundant, but see 5.4 below.
assigned
x@assigned is False if x never was assigned a value, True otherwise.
value
x@value holds the contents of x. Valid only if x@assigned is True.
size
the number of bytes x occupies in memory. This is mainly useful for interfacing with other languages.
type
the routine id of the type checking routine assigned to the variable when it was declared.
deftype
the routine id of the common type of all elements of a nonatom. Available for nonatoms only.
id
an integer you can use as an alternate way to access x (see 5.4 below).
scope
a value that tells in which part of the program the symbol is defined. See 5.2 below.
format
a default format used to display the value of the variable. See format string specification in the entry for printf() in part C.
decl_mode
this is True if the variable was declared using new_var(), and False otherwise.

readonly
roNo for variables, roYes for locked variables, roConst for constants.
types
available for nonatoms only. It is a sequence of integers the length of the nonatom. Each integer is the routine_id of the type function of the matching element.

   Only the value and format metadata can be directly changed; the other attributes are read-only, or can be changed only through dedicated routines.

   A record of all metadata a single symbol has can be retrieved using the get_meta function. The record has the reserved type SystemVarMeta and has elements with the names and indexes as in the list above. The argument of get_meta is either a double quoted variable name, or an expression evaluating to the id of the variable the metadata of which are requested.

   Formats are specified as for printf(). See the entry for this function in the alphabetical part C. The @format is used only if it has a value other than "", which is its default.

5.2 Scope of a variable.

   A program is made of a main file (the one you feed the interpreter with) and zero or more auxiliary files. Named scopes inside files may exist (see Chapter 6, "Included files and namespaces"). Both are referred to as abstract files.

A symbol can be visible:

   As a result, the scope metadata has three possible values: sGlobal, sPublic and sPrivate, respectively.

   Symbols that have a different scope coexist together. But, at any given time, only one of them is referred to by the name they share. This symbol is said to shadow the others.

   Private symbols shadow public symbols, and public symbols shadow global symbols without namespace.

   A clash between two symbols sharing the same name and both visible at some point is an error condition, since the interpreter does not know which one the general_identifier designates. The error occurs only when the ambiguous symbol is used.

   The word "symbol" is purposely used here instead of "variable", because the notions above also apply to routines (see Chapter 8 "Routines").

5.3 Declaring a variable.

5.3.1 Type of a variable.

   Types in OpenEuphoria describe logical properties of values a variable may hold. There are four ways to declare a variable, and all but one require an explicit type:

   Only in the last case is explicit typing absent. But, from the values the three parameters of a for loop have, an integer or atom type is guaranteed.

Formally, there are four sorts of types in OpenEuphoria:

5.3.2 Type of a nonatom element.

   Nonatoms rely on a default type, which is their deftype metadata. Elements of nonatoms may have any type, but they should pass the type checking thus defined. They are registered as having this default type.

   The programmer always has the option to specify the type of an element in a nonatom using the cast primitive. When this happens, a type check of the current element using the supplied type is performed, again regardless of current type checking status.

   The cast primitive has the following form:

   cast(general_identifier,index,type)

   This is a procedure call which acts on the nonatom specified as first argument. It sets the type information for element general_identifier[index] to type. type is either a type name or the routine id of the type function.

5.3.3 Declarations.

   A variable must be declared before being used. The only exception to this principle is "for" loop indexes.

   A variable definition takes the following form:

   [global |static ]type {identifier[=value] }

that is, a type name followed by one or more items. These items are either variable names or name=value initialized variables.

   The variable's initial value is computed before the variable is created. This allows an identifier to shadow another one while retrieving the shadowed value at initialization time.

   The optional global keyword makes the variable(s) visible outside of their current abstract file, giving them a scope metadata of sGlobal. It is not allowed for variables private to a routine.

   The optional static keyword applies to routine private variables. It makes their values persist between invocations of the routine.

A declaration may appear in any place outside routines or blocks.

Declarations in routines must be grouped right after the routine definition, as in:

function deloddnumbers(sequence of integer s)
integer i=1
sequence s0

--you can't move either of the two lines above past here.

s0=remainder(s,2) --parity of each element of s
while i<=length(s) do

if s0[i]!=0 then
s=remove(s,i) --drop the odd number...
s0=remove(s0,i) --...and its parity, so that s0 stays aligned with s
else i+=1 --it is easy to forget, but definitely necessary...
end if
end while
return s
end function

   Additionally, since routine variables are private, you cannot declare them as global.

   Section 5.4 below will show you how to relax the restrictions above.

5.3.4 Constants.

   Constants are identifiers that are assigned a value at initialization time. That value cannot change thereafter. Using constants instead of hard-coded repeated values is recommended for two reasons at least:

   Declaring a constant takes the following form, quite similar to a variable declaration:

[global ]constant {[type ]name=value}
Indeed, you can declare any number of constants in a single statement.

   This statement may appear wherever a variable declaration is allowed. Contrary to variables, the type specification is optional, a type of object being assumed if it is not present.

   It may happen that a constant is declared with some value even though a constant with the same name and the same value is visible. In this case, the duplicate declarations are ignored, whereas Euphoria throws an error in this situation. Note that a constant defined inside a routine cannot be global.

   Attempting to modify the value of a constant will raise an exception. There is no way to change the value of a constant using only OpenEuphoria statements.

5.4 Variable id's.

5.4.1 The id metadata.

   Rather than being referred to by its name, a symbol can be accessed through its id metadata. Routines will have routine_id's (see Chapter 8), and variables have variable_id's. One may consider that all variables are named elements of a large sequence, and the variable_id's are indexes into this sequence.

   When a variable is destroyed in any fashion, mainly because it is a private, nonstatic variable of a returning routine, its id is not recycled. This guarantees that an id always refers either to the same variable or to no variable at all; in the latter case, an assignment through that id causes an error.

   The built-in function isvarid takes an integer and returns True if this integer is the id of a variable and False otherwise.

   Individual elements of nonatoms have variable ids, so that "s[3][5]@id" makes sense and returns an integer you can use as shown below. The id "follows" the element it tags during the transformations of the host array/sequence, so that the returned id may well give you the contents of s[2][7] if some elements were added or removed from s[3] or s.
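
   The id rules of this section can be modelled outside OpenEuphoria. The Python sketch below is only an illustration of "ids are never recycled"; it does not describe how an OpenEuphoria implementation must work. The class VarTable and its method names (create, destroy, is_id, assign, read) are hypothetical and are not the built-in routines of this chapter.

```python
class VarTable:
    """Toy model of variable ids: ids grow monotonically and are never reused."""

    def __init__(self):
        self._slots = {}     # id -> [name, value], live variables only
        self._next_id = 1    # next id to hand out; never decremented

    def create(self, name, value=None):
        var_id = self._next_id
        self._next_id += 1
        self._slots[var_id] = [name, value]
        return var_id

    def destroy(self, var_id):
        # Destroying a variable frees its slot, but its id stays retired.
        del self._slots[var_id]

    def is_id(self, candidate):
        return candidate in self._slots

    def assign(self, var_id, value):
        if var_id not in self._slots:
            raise KeyError(f"id {var_id} refers to no variable")
        self._slots[var_id][1] = value

    def read(self, var_id):
        return self._slots[var_id][1]


table = VarTable()
a = table.create("a", 10)
b = table.create("b", 20)
table.destroy(a)
c = table.create("c", 30)   # receives a fresh id, not the one a used to have
```

   In this model, the id once held by a stays invalid forever, so a stale id can never silently reach a different variable.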

5.4.2 Manipulating existing variables.

Five routines are provided to handle variables through their id's:

   Note that set_var will fail if the symbol with this id is not to be written to ( var(id)@readonly != roNo ).

   Example: assume you have a variable named balance. Its value must be assigned to the variable credit if it is nonnegative, and to the variable debit if it is less than 0. You also want to print a message reflecting what has just been done. The printing format of credits may not be the same as for debits.

   A simple solution can be devised using the tools above:

baltype={var_id(credit),var_id(debit)}
...
b_id=baltype[1+(balance<0)]
set_var(b_id,balance)
msg=sprintf("Your %s is " & var(b_id)@format,{var(b_id),get_var(b_id)})

   In Euphoria (2.4 and before), you'd have to explicitly write an if statement to perform this admittedly simple task.

   Also note that, since variable id's are global, they can be used to access shadowed symbols or static private variables.

5.4.3 Creating variables on the fly.

   It may be useful to create variables in places other than variable declarations, especially inside routines. This can be done as follows:

id=new_var(type,name,_)
id=new_var(type,name,value)

   This is equivalent to saying, in the proper place, "type name" or "type name=value", and gives you the id for this variable. Note the use of the anonymous placeholder '_' when no initial value is provided.

   Also note that variables can be created conditionally using this mechanism. You cannot create a global or static variable with new_var(). A variable declared in this way is private if it is inside a routine, and public otherwise.

   Creating a symbol clashing with an existing one, or accessing an id that does not exist, are error conditions, as might be expected.

5.4.4 Deleting variables.

   As new_var() is primarily intended to create temporary variables, you may remove them once their short life span is over. This can be done with:

   del_var(id)

For obvious reasons, there are limitations on the use of such a tool:

5.5 Using a variable.

5.5.1 Variables and values.

   If the general_identifier of a variable appears on the left side of an assignment symbol (see 5.5.2 below), its value will be subject to change:

   If a variable is passed by reference to a routine (see 8.3), the routine will modify it only if it can write to it.

5.5.2 Assignments.

   A variable may be assigned a value using its id and the set_var() routine, or using an initialization on declaration; but these are by no means the most frequent way of doing it.

   There are three ways to assign a value to a variable using assignment operators:

general_identifier assignment expression
#({general_identifiers}) assignment expression
#({general_identifiers})# assignment expression

In the first form, a variable gets (modified by) the value on the right side. The second form allows this to take place on several variables at the same time, so that they are assigned, or modified by, the same value to which the righthand side evaluates.

   The third form normally requires the righthand side of the assignment to be a nonatom. The first element of the list on the left side of the second # is assigned the first element on the right side, and so on until one or both sides run out of elements. It could be called "desequencing", as it sends the contents of a sequence to several variables. If the righthand side is an atom, it is treated as a sequence of length 1.

   To retrieve only some elements from the righthand side in intermediate position (an element of higher index is retrieved), use the "_" universal placeholder where a variable would be expected. This effectively discards the value that would otherwise be assigned.

   As an example, if fIxed is an array of char and seQ is a sequence, you can assign the contents of seQ to fIxed, truncating extra characters if seQ is too long, by coding:

#(fIxed)#=seQ

   If seQ is not long enough, the remaining elements at the right end of fIxed are not affected.
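
   The pairing rule of the third assignment form can be sketched in Python. This is a model of the rule as stated above, not OpenEuphoria code: the function desequence, the SKIP marker standing in for the "_" placeholder, and the dictionary env standing in for the set of variables are all hypothetical.

```python
SKIP = object()   # stands in for the "_" universal placeholder


def desequence(targets, source, env):
    """Assign source[k] to the variable named targets[k], pairwise."""
    if not isinstance(source, (list, tuple)):
        source = [source]     # an atom is treated as a sequence of length 1
    # zip stops as soon as either side runs out of elements
    for name, value in zip(targets, source):
        if name is not SKIP:
            env[name] = value
    return env


env = {"a": 0, "b": 0, "c": 0}
desequence(["a", SKIP, "c"], [1, 2, 3, 4], env)   # 2 is discarded, 4 is unused

env_atom = {"a": 0, "b": 0}
desequence(["a", "b"], 7, env_atom)               # only a is assigned
```

   Variables for which the righthand side has no element simply keep their previous value, which matches the behaviour described for a too-short seQ.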

5.6 Aliasing an element of a variable.

   You can specify an alternate name for an element of a nonatom. This is especially handy when complex index specifications are involved. The available tools are:

name aliased as alias
rename aliased as newalias
unname alias

name supplies an alternate name for an element in an array or sequence.
rename changes an existing alias to another one.
unname makes an alias unavailable.

   Aliases, in all this section, are identifiers, while aliased has the form identifier{index specification}. They act exactly as structure members do. As a result, an element keeps its name even if its position in the host sequence changes, as long as it exists.

6 Included files and namespaces.

   OpenEuphoria adopted the open philosophy of Euphoria in the sense that a lot of functionality is to be found in libraries rather than in the language itself. The main advantage is that anyone can customize or upgrade routines in the libraries easily - they are plain text source files -, rather than tinker with the OpenEuphoria interpreter/compiler source itself, which may be written in another language. The drawbacks are loss of performance, version conflicts and symbol clashes.

   The physical files in which the program looks for such code are called included files.

6.1 Namespaces.

   Because symbols from different files may share the same name and be visible from the same location in the program, there must be a way to unambiguously refer to any of them.

   Namespaces are the way. They are identifiers that prefix the symbol name. The prefix is separated from the raw symbol name by a colon ':'.

   Namespaces apply to global symbols only. By construction, there is only zero or one public symbol and zero or one private symbol to be seen from any given location in the program. However, global symbols do not necessarily harbor explicit namespaces. Global symbols are in the default namespace.

6.2 Including and naming a file: a first approach.

6.2.1 The include statement.

   Auxiliary files are made available to the main file using the include statement:

include filename|(expression)
include filename|(expr) as namespace

   Remember that a filename is either a string or a parenthesized expression whose run-time value is to be interpreted as a file name. The (generated) string is passed to the operating system as a filename as-is, and must conform to whatever syntax rules the OS enforces, like double quoting long file names with spaces in them.

   The simplest form makes global symbols in filename visible to the other files. This may lead to symbol clashes, some of which are caused by files the coder did not write. See sections 6.3 and 7 below.

   The second form allows using the prefix namespace: for global symbols in filename. Several filenames may share the same namespace.

When a file is included for the first time, its statements are executed. Other subsequent include statements relative to this file do not trigger this action.

   include statements always declare namespaces: the "default" namespace is used even when none is supplied.

   The same file may appear with various namespaces in the same physical file. This is not really a feature, but legacy behaviour. On the brighter side, various files may include the same file with different namespaces.

   Using a string enclosed between parentheses causes the string to be considered as an expression, the evaluation of which is used as a filename.

6.2.2 Namespaces.

   Namespaces are a way for a given file to refer to symbols in another given file. As a result, a namespace is known only in the file where it appears after an as keyword. So, there are only two sorts of symbol clashes:

6.3 The import, promote and demote statements.

   Because of the somewhat undiscriminating nature of the include statement, which has symbols appearing in two namespaces when one was specified, and which acts in the same way upon all symbols in the included file, another construct is needed to get a more controllable behaviour. Changing the rules for include would most likely break too much Euphoria code.

6.3.1 The import statement.

   The statement

import filename|(expression) as namespace

makes the symbols of filename appear in the namespace namespace. The symbols are not visible in the default: namespace, contrary to what the include statement does.

   A string immediately following import and enclosed in parentheses is an expression that must evaluate to a string. That string is then processed as a filename, just as it would be for an include statement.

   Thus, import (misc.oe) as msc will look for a record called misc, with a member named oe, or a sequence misc with a named element oe. If this can be found and holds a string, this string is the filename to be imported.

   But import misc.oe as msc will look for a file called misc.oe and will make its global symbols visible in the namespace msc only.

6.3.2 The promote statement.

   Because it is sometimes convenient or useful to use global symbols without using prefixes, it is possible to select symbols to be promoted to the default (unprefixed) namespace.

   The supported syntaxes are:

promote "{identifier}" from namespace --grant unprefixed access to symbols explicitly specified in the list.

promote identifier from namespace --identifier is assumed to be a sequence of strings, each of them being the name of a promoted symbol as above.

promote _ from namespace --promotes all symbols from namespace.

promote but list from namespace --promote all symbols but the supplied exclusion list. The list may have either of the first two forms above.

   Promoted symbols can then be accessed as if the file they come from had been included using the longer form of the include statement.

6.3.3 The demote statement.

   Promoted symbols can be demoted, which means they still exist in their source namespace, but no longer in default. The following syntaxes are supported:

demote "{identifier}" [from namespace]
demote identifier [from namespace]
demote _ from namespace
demote but "{identifier}" from namespace
demote but identifier from namespace

These forms drop unprefixed access for the symbols listed, in a way almost symmetric to how promote adds them.

   Why "almost"? Because the first two forms do not need to specify namespaces.

Indeed, there is normally one symbol of each name in the default namespace, and there is no ambiguity in the command given, hence no systematic need for the extra argument.
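
   The interplay of import, promote and demote can be pictured with a small Python model. It is a sketch under simplifying assumptions (one dictionary per namespace, no clash detection), and the class Namespaces with its methods is hypothetical, not part of OpenEuphoria.

```python
class Namespaces:
    """Toy model: each namespace maps symbol names to symbols."""

    def __init__(self):
        self.spaces = {"default": {}}

    def import_file(self, symbols, namespace):
        # import: the symbols land in the named namespace only
        self.spaces.setdefault(namespace, {}).update(symbols)

    def promote(self, names, namespace):
        # promote: also expose the listed symbols without a prefix
        for name in names:
            self.spaces["default"][name] = self.spaces[namespace][name]

    def demote(self, names):
        # demote: remove unprefixed access; the source namespace is untouched
        for name in names:
            self.spaces["default"].pop(name, None)

    def lookup(self, name, namespace="default"):
        return self.spaces[namespace][name]


ns = Namespaces()
ns.import_file({"sin": "misc:sin", "cos": "misc:cos"}, "msc")
ns.promote(["sin"], "msc")
promoted = ns.lookup("sin")       # unprefixed access now works
ns.demote(["sin"])                # no namespace needed for this form
```

   After the demote, sin is still reachable with the msc prefix, which is the asymmetry explained above: demoting only needs the unprefixed name.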

7 Scopes.

7.1 Named scopes.

   The construct

[global ]scope identifier
... some code ...
end scope

lets you pretend that the enclosed code comes from an external file, whose name does not matter. Any code that may appear at some position in a source file may appear in a scope block at the same position.

A global symbol inside a named scope can be used:

A scope declared as global can be considered as an external file, and is assumed to have been included using the include statement, with the scope name as a namespace. For this reason, namespaces are said to be associated to abstract files, and not only to physical files.

7.2 Unnamed scopes.

   Unnamed scopes may appear in routines and, in this case, follow the same rules as code in routines.

   This is especially handy when a small part of a routine needs a few variables not needed elsewhere in the routine. For clarity of code, putting them inside a scope block helps separate them from the ones most likely to be used.

7.3 The use statement.

   The global symbols of a named scope can be accessed without any prefixing by issuing use scopename.

   The use statement is always local, which means that its effects stop at the end of execution of the code in the scope or routine it appears in.


8 Routines

   A routine is a piece of code which can be called and eventually returns. The first phrase means that control may be transferred to the first executable statement of the routine. This is not necessarily the first general-code statement, since variable initializations are executable statements. The second phrase means that, when the routine is finished with its work, it returns control to the statement following the calling statement.

   On return, a routine may provide a value, be it an atom or not. If the routine returns anything, the routine call is evaluated to the returned value.

   There are several keywords to define routines, because they have different histories and roles to play.

8.1 Defining a routine.

The definition of a routine involves:

   Section 8.3 below describes formal parameters of routines.

8.1.1 Routine types.

   routine is the generic word designating a piece of code which has its own variables, can be called and must return, unless it terminates some process.

   A procedure is a routine which does not return any value.

   A function is a routine which must return a value.

   A type is a special sort of function. It must return a boolean, and takes exactly one argument. Specifying a type routine defines a user-defined type with the same name.

   A reftype is a special sort of function. It must return a boolean, and takes exactly two arguments: an integer, which is the id of the variable to be assigned, and the value to be assigned to it. Specifying a reftype routine defines a user-defined type of the same name.

   A handler is a routine designed to handle events. These events are triggered at run-time, and handlers are not primarily meant to be called explicitly, even though they can as any routine. The argument list of a handler is described in 11.2 below.

   Additionally, when a variable is assigned, the (ref)type function associated to it is executed prior to the assignment, and an exception is raised if it returns false (see Chapter 1, "Types", for details). For this reason, (ref)types may be considered as a hybrid between functions and handlers.

8.1.2 Forward declaration.

   Sometimes, it makes the code clearer to use a routine even though it was not defined yet. In order to do this, you can use the forward attribute, followed by the routine definition.

   When time has come to code the routine statements, just issue the statement routine type routine name, and go ahead with the statements in the routine. The full definition of the routine is already known, so that this shorter form is enough. You still can give the full definition again, but an error will occur if there is a mismatch between the two definitions.

   Obviously, if a routine is declared forward and no flesh is added to this bone, an exception will occur at the end of the parsing of the source file.

8.1.3 Calling a routine.

An explicit routine call is made of:

   The list of values should conform to the list of types specified in the routine definition. For instance, if foo was defined as

routine foo(integer i,string s)

, then a call to foo must be like foo(expr1,expr2). expr1 is checked to be an integer, and expr2 is checked to be a string. If one of the checks fails, or if there are not exactly two arguments, an exception will occur.

   If the routine is to return a value, and if this value is to be used, then one of the syntaxes below must be used:

general_identifier assignment foo(i,s)
#({general_identifiers}) assignment foo(i,s)
#({general_identifiers})# assignment foo(i,s)

   Here, assignment stands for the equal sign preceded with any operator listed in 3.1.

   These forms are only special cases of corresponding forms of assignment shown in section 5.5.2 above, and don't need much more commenting, as a routine call is just another expression.

   You may ignore the value returned by a function by desequencing it to the empty list, as in #{}#=foo(i,s). #(_)#=foo(i,s) would do as well.

8.2 The return and resume statements.

8.2.1 Returning from a routine.

   To signal that a routine must terminate and return control to the statement logically following the routine call, use the return statement. return by itself just terminates the routine; return expr does so and returns the value of expr.

The concept of "statement logically following the call" is as follows:

    If a routine does not have an explicit return statement, return is assumed right before the end mark of the routine.

8.2.2 Resuming execution.

   The resume statement asks the routine to terminate and reexecute the statement which triggered the routine. For this reason, this statement is meant for exception handlers and, normally, should not be executed when the routine is called explicitly. There is no point in returning a value here, so resume does not accept a value to return.

8.3 Formal parameters of a routine.

Formal parameters are characterised by three properties:

8.3.1 Passing mode.

   An argument is either a single variable name, a constant or a more complex expression like "x+1". Here, "variable name" extends to whatever might have a variable id, like a named sequence element or record member.

   An argument which is not a variable name can only be passed by value.

   However, when an argument is a variable name, two actions can be performed. Proceeding as above is the first option; the alternative is to let the variable be temporarily aliased by the associated formal parameter in the routine body.

   The first method is called "pass by value", and is the only method explicitly used by Euphoria. It is the default passing mode in OpenEuphoria.

   The second method is called "pass by reference", and must be explicitly enabled in the formal parameter specification.

To allow passing by reference of a formal parameter, prefix it by the update keyword in the routine definition. When calling a routine, if the n-th argument is an expression other than a variable name and is supposed to be passed by reference, it will be passed by value instead. You can use the keyword byval just before the expression to emphasize that the effect of the update keyword is temporarily suppressed.

   When calling a routine, if the n-th argument is a variable name and is supposed to be passed by reference, it must be preceded by one of the keywords byref or byval; otherwise an exception will occur. The variable will be passed by reference if byref is used, and passed by value if byval is used.

   Remember that, when a variable is passed by reference, any modification made by the called routine to the formal parameter to which this variable is mapped by the routine call is reflected to this variable. When passed by value, no modification is reflected, since the routine operates only on a local copy of the variable.

8.3.2 Parameter types.

   The type of a parameter is explicitly stated or not. In the latter case, the type of the previous parameter is assumed. Hence, the first formal parameter in a routine definition must have an explicit type.

   Specifying a formal parameter as array[(size)] or sequence allows to indicate that no specific type is expected for the elements of the nonatom passed. You may give a size to an array or just leave it out.

8.3.3 Example.

forward function foo(string s,update result,integer i)
...
seq="heLlo, woRld!"
...
x=foo(seq,append(seq,x),3)
--the second argument is passed by value regardless of the update keyword,
--as nothing can be mapped to this compound expression.

x=foo(length(s),s,j)     --ERROR: length(s) is an integer, and the first parameter
--is a string.
x=foo(seq,byref seq,0)   --this one is correct
...
function foo    --nothing more to say, here comes the beef.
...
result=lower(seq)   --after the correct call to foo, seq is "hello, world!",
--since result is the second parameter of foo, is passed by reference and
--seq is a variable name passed to foo as second argument, so that result
--aliases it.

8.4 Variables in a routine.

   Variables declared inside a routine are private and shadow any existing symbol with the same name. Formal parameters of routines have the same behaviour.

   A routine can access by name any public or global variable in scope at the time of the call, as well as its own private variables. A private variable cannot be accessed by name outside of its routine.

   On return from a routine, all its private variables cease to exist, except those declared as static.

8.5 Calling a routine.

8.5.1 The standard way.

   Explicit invocation of a routine takes the following form: the routine name, a left parenthesis, the possibly empty, comma separated list of arguments, and finally a right parenthesis.

   For instance, i=find("myself",someSequence) is a routine call with two arguments. First argument is "myself" and the second one is someSequence. It is a function-like call, since a value is retrieved from the called routine on return.

   The arguments must match the formal parameters in number and type. Failure to do so will raise an "ArgError" exception.

8.5.2 A special use of desequencing.

   You can use a sequence to represent several consecutive arguments in a routine call. To do so, the sequence must be prefixed by the # desequencing sign, and be enclosed in parentheses to avoid any risk of being mistaken for a hex number.

   Thus, if you want to issue the call foo(1,2,3,0), and you have a nonatom fooArgs at hand with the value {1,2,3}, you can issue foo(#(fooArgs),0) with precisely the same effect, which is to call the foo routine with the four arguments 1, 2, 3 and 0.
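
   Readers coming from other languages may recognize this as argument unpacking. As a point of comparison only, here is the same call written in Python, where the * prefix plays the role that #( ) plays above; the function foo and the name foo_args are placeholders taken from the example, and nothing here is OpenEuphoria syntax.

```python
def foo(a, b, c, d):
    # Return the arguments as received, so both calls can be compared.
    return (a, b, c, d)


foo_args = [1, 2, 3]

direct = foo(1, 2, 3, 0)
spliced = foo(*foo_args, 0)   # the three elements become three arguments
```

   Both calls hand the same four arguments to foo.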

8.6 Dynamic invocation of routines.

Just like variables, routines have an index called routine_id. Five functions are provided to manage dynamic calls:

   Built-in routines, even though they are defined in no file, have routine id's which can be retrieved as for any user-defined routine.

8.7 Routines and namespaces.

   Global routines may be accessed by name when the abstract file they are defined in is included by other files. Any routine can be accessed through its routine id.

   OpenEuphoria provides quite a few built-in routines, like the length() function (counts the number of elements), the integer() type (returns True on integers and False otherwise) and so on. They are treated as global symbols, and also have a namespace of their own, called builtin.

   Defining a routine with the same name as an existing one in a given namespace (including default: or builtin:) generates a warning and shadows the preexisting routine with the newer one.

8.8 Routine metadata.

   The @ construct used for variables (see 6.1) is also available for routines, with a few restrictions however because some of them don't always make sense. The available metadata for routines are:

name
the name of the routine
type
a small integer representing the keyword used to define the routine. The recommended mapping is
  1. routine
  2. function
  3. procedure
  4. type
  5. reftype
  6. handler
id
the routine id of the visible routine with the given name.
format
available for types and reftypes only. Default format to be used to print variables of this type.
types
a possibly empty array of integers, which are the routine_ids of the types of the formal parameters, in the order that they were enumerated at definition time.
scope
sGlobal or sPublic, as for variables.

   Unless stated otherwise, the meanings of the same metadata for routines and variables are very similar. The get_meta function is also available for routines (see 5.1) exactly as for variables, except that it returns a record of the reserved type SystemRtMeta, with names and positions as in the list above.

9 Code blocks

   Code blocks are code between a blocktype statement and the matching end blocktype statement.

   Blocks are nested, which means that the order of the blocktypes and the order of the end blocktypes must be exactly the reverse of each other. Failure to do so causes an irrecoverable syntax error.

9.1 Labelling blocks.

   The statement label identifier may appear just before a code block and uniquely identifies it. The identifier can be used in instructions which control code block execution (see 9.4 and 9.5 below).

9.2 The "if" block

   Remember that this block may have a substructure:

if cond then

   general-code

[elsif cond then general-code]
[else general-code]
end if

   This will execute some general-code depending on the values of the cond expressions, according to the following rules:

9.3 The select block.

   When several courses of action might be taken according to the value of some expression, you can always stack a few elsif statements inside an if block. However, it may not be the clearest way to code this sort of situation, and this is why an alternative construct is provided. Also, you may want to take several branches in succession, which the if statement does not allow.

   The structure of a select block is as follows:

select expr

case statement
[otherwise general code]

end select

and a case statement is as follows:

case expr:
general code
case rel_op expr:
general code
case expr thru expr:
general code
case condition:
general code

9.3.1 The selector.

   The expression following the keyword select is evaluated, and this value is called the selector of the block. Decisions will be made according to its value. Each branch of the decision tree is represented by a case statement; an otherwise branch may be there as well.

9.3.2 The case statement.

   The keyword case may be followed by four different types of items:

9.3.3 Instruction flow inside a select block.

   Each case statement starts by (a simplified form of a) conditional clause. The code following a case statement is executed whenever the corresponding condition is true. After that, if the block was not exited, the next case statement is inspected.

   The process goes on until one of the three mutually exclusive situations happens:

9.3.4 The otherwise statement.

   This statement is optional, and is allowed only inside a select block. It may appear at most once in a block.

   If it is reached, and one of the branches of the select block was taken, the block is exited; otherwise, execution continues past the otherwise statement.

9.3.5 The stop statement.

   Allowed only inside a case branch, it causes that branch to be exited and the next case condition to be tested. If there is none left, the select block is exited.

9.4 Loops.

9.4.1 The for loop block.

   The complete syntax is as follows:

for identifier=start value to end value [by increment] do

general code

end for

   When the for statement is reached from outside the block, start value, end value and increment are computed. If the "by" clause is not present, the increment is set to 1. They all must evaluate to an atom. These values are not computed again during the subsequent loop iterations.

   The loop index variable identifier must not have been declared, and is assigned the start value. It cannot be modified.

   If (start value-end value)*increment is greater than zero, no iteration is performed and the loop is exited, as there is no way for the index variable to get closer to the end value. Otherwise, the first iteration starts.

   If the index is not between the start and end values, and if the for statement is reached from inside the loop, the loop is exited without any further iteration. Execution resumes right after the end for statement. The loop index variable remains available until the next for statement using the same identifier as its index variable. Otherwise, a new iteration starts.

   When the end for statement is reached, the index is incremented by the increment and control is transferred to the for statement.
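
   The iteration rule above can be checked with a short Python generator. This is a sketch of the rule as written, not of any actual implementation; the name for_indices is hypothetical. The text does not say what happens with an increment of zero, so rejecting it here is purely an assumption of the sketch.

```python
def for_indices(start, end, increment=1):
    """Yield the successive index values of a for loop, per the rules above."""
    if increment == 0:
        # Assumption: the text does not cover this case; the sketch refuses it.
        raise ValueError("increment must not be zero in this sketch")
    if (start - end) * increment > 0:
        return                      # the index can never approach the end value
    low, high = min(start, end), max(start, end)
    index = start                   # bounds and increment are computed only once
    while low <= index <= high:     # "between the start and end values"
        yield index
        index += increment          # what reaching "end for" does


ascending = list(for_indices(1, 5))
descending = list(for_indices(5, 1, -2))
no_iteration = list(for_indices(5, 1))
```

   The last call performs no iteration because (5-1)*1 is greater than zero.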

9.4.2 The while loop.

   while cond do general-code end while

   Executes an iteration of the loop if cond is true. Otherwise transfers control right after the "end while" statement.

   The end while statement causes the while statement to be executed again.

9.4.3 The wfor loop.

   The complete syntax is as follows:

wfor identifier=start value to end value [by increment] do

general code
end wfor

   This loop is a sort of hybrid between a for and a while loop, hence the wfor name.

   If the by clause is not present, the increment is set to 1. These values are computed whenever the wfor statement is executed, and always must evaluate to atoms.

   The loop index variable identifier must have been declared. It is an ordinary variable which may be assigned inside the loop.

   If the index is not between the start and end values, the loop is exited without any (further) iteration. Execution resumes right after the end wfor statement. Otherwise, a new iteration starts.

   When the end wfor statement is reached, the index is incremented by the increment and control is transferred to the wfor statement.

9.5 Exiting blocks.

   Exiting a block means that the next executed statement is the one following the end blocktype statement which ends the block.

   A code block will be said to be "active" relative to this statement if it contains the statement.

9.5.1 Exiting keywords.

   The exit statement can be used to exit a loop, the exif statement can be used to exit an if block, and the break statement exits a select block.

   They all have an optional argument. If they don't, the current relevant block is exited. Otherwise, the specified block (see below) is exited.

9.5.2 Optional argument for exiting keywords.

   The optional argument of an exiting keyword is either a number or an identifier. The phrase "relevant block" translates as "loop block" when referring to an exit statement, as "if block" when referring to an exif statement and as "select block" when a break statement is involved.

   If the argument is an identifier, it must be a label tagging an active relevant block. Labels are attached using the label statement in 9.1 above. Failing this consistency criterion raises an exception. Otherwise, the block tagged by this label is exited.

   If the argument is an integer greater than zero, this number is the number of relevant blocks nesting the current one that must be exited. Thus exit 1 means "exit the active loop above the current one", exit 2 exits the loop above the one above the current one, and so on.

   If the number is negative, then the active relevant blocks above the current one are counted backwards from the topmost one to determine the block to be exited. Thus, exit -1 means "exit the topmost active loop block", exit -2 means "exit the active loop just below the topmost one", and so on.

   An argument of 0 is ignored, as it would only emphasize that the current relevant block is to be exited.
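
   The counting rules for a numeric argument are easy to get wrong by one, so here is a small Python sketch of them. The function block_to_exit is hypothetical; it receives the active relevant blocks listed from the topmost one down to the current one. Raising an error for a number that designates no block is an assumption of the sketch, as the text above only specifies an exception for bad labels.

```python
def block_to_exit(active_blocks, argument=0):
    """Pick the block an exiting keyword leaves.

    active_blocks: relevant active blocks, topmost first, current one last.
    """
    depth = len(active_blocks)
    if argument >= 0:
        # 0 is the current block, 1 the one above it, and so on
        position = depth - 1 - argument
    else:
        # -1 is the topmost block, -2 the one just below it, and so on
        position = -argument - 1
    if not 0 <= position < depth:
        # Assumption: out-of-range numbers are treated as errors here.
        raise IndexError("no such active block")
    return active_blocks[position]


loops = ["outer", "middle", "inner"]   # "inner" holds the exit statement
plain = block_to_exit(loops)           # exit
up_one = block_to_exit(loops, 1)       # exit 1
topmost = block_to_exit(loops, -1)     # exit -1
```

   With three nested loops, exit 2 and exit -1 designate the same block, as do exit 1 and exit -2.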

9.6 Iteration control for loops.

   A loop iteration can be stopped at any point during its execution using the keywords next or retry. These keywords accept the same optional argument as exit.

9.6.1 The next statement.

   This statement causes a new iteration of the loop to occur. This means that control is transferred to the opening statement of the loop, causing index update in for or wfor loops, and condition evaluation in a while loop.

9.6.2 The retry statement.

   This statement causes the current iteration of the loop to start again. This means that control is transferred to the first statement inside the loop block. Thus, the index of a for or wfor loop is not updated, and the condition of a while loop is not evaluated.
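
   The difference between next and retry can be illustrated with a small model of a for loop. This Python sketch is not part of the language: run_for_loop, flaky_body and the convention that the loop body returns the string "retry" or "next" are assumptions made for the illustration, as are the inclusive bounds and the step of 1.

```python
def run_for_loop(first, last, body):
    """Run body(index) for index = first..last (inclusive, step 1).

    body returns "retry" to restart the current iteration without
    updating the index, or anything else to go on to the next iteration.
    Returns the list of index values the body was entered with.
    """
    visited = []
    index = first
    while index <= last:
        visited.append(index)
        if body(index) == "retry":
            continue      # retry: back to the first statement, same index
        index += 1        # next (or normal end of body): update the index
    return visited


attempts = {"left": 2}


def flaky_body(index):
    # Ask for a retry twice when the index is 2, then carry on.
    if index == 2 and attempts["left"] > 0:
        attempts["left"] -= 1
        return "retry"
    return "next"


visited = run_for_loop(1, 3, flaky_body)
```

   Here the body is entered with the index values 1, 2, 2, 2, 3: the two retries re-enter the body with the index unchanged, while next moves on to the following value.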


10 The built-in debugger.

   A run-time debugger makes it extremely easy to debug a program, much easier at least than scattering a few print() statements and having to guess what is going wrong in program flow, variable assignments and other issues.

   The integrated debugger is enabled by the with trace statement, and completely turned off by the without trace statement. Turning it off is the default behaviour, which saves execution time.

   If the debugger is enabled, you start it with the trace(1) statement, and turn it off from the running program with the trace(0) statement.

   A command/status line will also be provided, as the debugger may process user input (see 10.2 below) and display some information to the user.

10.1 Debugger screen.

10.1.1 General description of the debugger screen.

   The main debugger screen shows about 15 lines of code, highlighting the one to be executed. This line will remain about the middle of the screen most of the time, so that some code before and after it can always be seen. It will be called the active line. Another line may be highlighted in some other way, and will be called the spot line.

   Another part of the screen is reserved to show the values of the most recently accessed variables. These values are updated as source statements are executed.

   The debugger must be implemented in such a way that it will not trace itself, nor trace events it may (cause to) trigger.

10.1.2 Available keystrokes.

The following actions should be requested using one-key keyboard shortcuts:

10.1.3 Other commands.

   Scrolling through code using the mouse buttons, movements or wheel actions is to be provided.

10.2 Debugger commands.

   Rather than immediate actions, the following are commands aimed at inducing specific behaviour from the debugger, or to set some trace scheduling.

10.2.1 Dynamic breakpoint.

   The b command allows you to enter a conditional expression. This expression must be a valid OpenEuphoria condition. This condition sets up a dynamic breakpoint, which is triggered any time the condition is true. The expression may use any variable in scope at the time it is defined. Whenever one of these variables gets out of scope (for instance, returning from a routine), the dynamic breakpoint is disabled.

   This breakpoint is independent from the static breakpoint that F8 toggles on and off.
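
   The life cycle of a dynamic breakpoint can be sketched as follows. This Python model is an illustration only: the class DynamicBreakpoint, its check method and the use of a dictionary for the variables in scope are assumptions. It also assumes that a disabled breakpoint stays disabled, which the text above does not state.

```python
class DynamicBreakpoint:
    """Model of the dynamic breakpoint set up by the b command."""

    def __init__(self, condition, variables):
        self.condition = condition             # callable taking the scope dict
        self.variables = frozenset(variables)  # variables the condition uses
        self.enabled = True

    def check(self, scope):
        """Return True when the breakpoint triggers in the given scope."""
        if not self.enabled:
            return False
        if not self.variables.issubset(scope):
            # One of the variables went out of scope: disable the breakpoint.
            self.enabled = False
            return False
        return bool(self.condition(scope))


# Hypothetical condition "i > 3", using the variable i.
bp = DynamicBreakpoint(lambda scope: scope["i"] > 3, ["i"])
results = [
    bp.check({"i": 2}),   # condition false
    bp.check({"i": 5}),   # condition true: breakpoint triggers
    bp.check({}),         # i is out of scope: breakpoint disabled
    bp.check({"i": 5}),   # stays disabled under the assumption above
]
```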

10.2.2 The ? command.

   The ? command allows you to enter a valid OpenEuphoria expression. This expression will be treated as a most recently modified variable and displayed as such.

10.2.3 The s command.

   The s command will prompt you to enter an OpenEuphoria expression. If this expression is not among the displayed variables, it is added as the ? command would add it. Moreover, a text box will open up and display a good deal of the expression value, considerably more than ? would have allowed. The S command (10.1) closes the box.

10.3 Status report.

   The status line referred to in 10.1.1 will display information about the state of the static and dynamic breakpoints, as well as an indication of which one was last triggered.


11 Event trapping and exception handling.

   Exceptions are situations which most likely come from an error. Exceptions will cause default or user-defined routines - handlers actually - to be executed when available. This way, the program knows that something possibly went wrong and may take corrective action as needed to avoid or soften the crash.

   Events are actions taken by the machine code being executed. Trapping them, also using handlers, allows the program to be informed of what is going on. Such hooks are of obvious use for debugging or profiling purposes, but they may serve many more useful programming needs as well.

   Because the same mechanism is used in both contexts, the term event will be used to refer to exceptions and program events alike.

11.1 Assigning a handler to an exception.

   This is done using the statement set_handler(expr, expr). The first expr must resolve to the routine id of a handler. The second expr must resolve to a string representing an event name.

   get_handler(expr), where expr resolves to an event name, returns the id of the handler for this event.

   A handler is called with five parameters:

11.2 Events.

   The following table lists the events and exceptions that call a handler, the parameter they pass as a last argument and the default handler action.

Name            Last parameter                 Default action
AfterAssign     {}                             does nothing
AfterCall       argument list                  does nothing
AfterIndex      {}                             does nothing
AfterRead       value                          returns the value
AfterReturn     {value}, or {} if none         does nothing
AfterWarning    {warning text,warning code}    does nothing
ArgError        {argument #,value}             aborts
BeforeAssign    value                          calls type checking code, possibly
                                               issuing TypeError, or does nothing
BeforeCall      argument list                  checks types, possibly issuing ArgError
BeforeExecute   statement text                 does nothing
BeforeIndex     {}                             converts indices to positive,
                                               possibly issuing IndexBounds
BeforeRead      {}                             does nothing
BeforeReturn    {value}, or {} if none         does nothing
BeforeWarning   {warning text,warning code}    displays the warning
IndexBounds     value                          aborts
RaisedError     error message                  prints the line # and the supplied
                                               error message, then aborts
RuntimeError    {statement text,error code}    aborts
SyntaxError     statement text                 aborts
TypeError       value                          aborts
UnknownToken    statement text                 aborts
ZeroDivide      {}                             aborts

   An assignment is preceded by a BeforeAssign and followed by an AfterAssign.

   Reading a variable is preceded by a BeforeRead and followed by an AfterRead.

   Calling a routine is preceded by a BeforeCall in the current scope and followed by an AfterCall in the routine scope.

   Returning from a routine is preceded by a BeforeReturn in the routine scope and followed by an AfterReturn in the calling statement's scope.

   Accessing an element in a nonatom is preceded by a BeforeIndex and followed by an AfterIndex.

UnknownToken is issued by the interpreter when it does not recognise a token.

RuntimeError is issued for a variety of reasons.

ZeroDivide is called when a division by zero happens.

RaisedError is called only by the "error" primitive.

11.3 The error procedure.

   This procedure takes a string as its argument, and passes it, as well as the usual four other arguments, to the RaisedError handler. Both the second and third arguments are zero.

11.4 The resume_execute() and return_execute() primitives.

   It may be desirable, when a resume or return instruction is executed, to execute a dynamically generated statement after leaving the handler but before the standard action is taken. The argument of resume_execute() and return_execute() is an expression which is fed to execute() at the appropriate time.

Example: assume that a string is being scanned, and its length may vary in the process. A while or wfor loop may do the trick, except that, after almost any statement modifying the scanning index or the length of the string, a check must be performed to avoid an index out of bounds exception.

   A clean solution then is to instruct the relevant handler to quietly exit the loop whenever this condition happens.

   Thus, one may code:

IOBhandler=get_handler("IndexBounds") --remember the previous handler
handler IndexBounds(integer varid,sequence subex,integer indexid)
   if varid=scanned@id then
      --when scanned is subscripted with an out-of-bounds index, just exit
      return_execute("exit")
   else
      --otherwise, chain to previous handler
      call_proc(IOBhandler,{varid,subex,indexid})
   end if
end handler

--now the loop
i=1
while i<=length(scanned) do
   ...
   --code that no longer needs repeated checks like
   --"if i>length(scanned) then exit end if"
   ...
end while
set_handler(IOBhandler,"IndexBounds") --restore previous handler

   The code inside the loop got rid of repeated checks and is clearer and leaner as a result. There is hardly any performance loss, since the handler is invoked only on an error condition.
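
   The save, chain and restore pattern used above can be tried out in a small model. The Python sketch below is an illustration under several assumptions: the handler registry is a plain dictionary, handlers take two parameters instead of five, the exception ExitLoop stands in for return_execute("exit"), and read_element and every_other are hypothetical helpers. The argument order of set_handler follows 11.1.

```python
class ExitLoop(Exception):
    """Stands in for return_execute("exit") in this model."""


def default_index_bounds(var_name, index):
    # Default action for IndexBounds in the table of 11.2: abort.
    raise RuntimeError("index %d out of bounds for %s" % (index, var_name))


_handlers = {"IndexBounds": default_index_bounds}


def set_handler(handler, event_name):
    _handlers[event_name] = handler


def get_handler(event_name):
    return _handlers[event_name]


def read_element(var_name, seq, index):
    """1-based subscripting that issues IndexBounds when out of range."""
    if not 1 <= index <= len(seq):
        return get_handler("IndexBounds")(var_name, index)
    return seq[index - 1]


def every_other(scanned):
    """Collect every other character without explicit bounds checks."""
    previous = get_handler("IndexBounds")      # remember the previous handler

    def quiet_handler(var_name, index):
        if var_name == "scanned":
            raise ExitLoop                     # just leave the loop
        return previous(var_name, index)       # chain to previous handler

    set_handler(quiet_handler, "IndexBounds")
    picked = []
    i = 1
    try:
        while True:                            # no check on i inside the loop
            picked.append(read_element("scanned", scanned, i))
            i += 2
    except ExitLoop:
        pass
    finally:
        set_handler(previous, "IndexBounds")   # restore previous handler
    return "".join(picked)
```

   Calling every_other("abcdef") collects the characters at positions 1, 3 and 5, and the loop ends when position 7 is read. The previous handler is back in place afterwards.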


12 External OOP support.

   Object oriented programming, or OOP, is not built directly into OpenEuphoria. External libraries will get notifications of OOP syntactic constructs and will have to implement these constructs.

   The OOP library is to be included in the reserved "OO" namespace. Normally, functions of the library are not called directly; the interpreter plugs in the appropriate OO calls, like a preprocessor would.

12.1 Recognised constructs and their translations.

Action                                OE syntax                         Translation
Starts a class definition             class Identifier                  constant Identifier=OO:begin_class()
Ends a class definition               end class                         OO:end_class()
Declares a private part of a class    private do                        OO:begin_private()
Ends a private part of a class        end private                       OO:end_private()
Declares a public part of a class     public do                         OO:begin_public()
Ends a public part of a class         end public                        OO:end_public()
Declares a protected part of a class  protected do                      OO:begin_protected()
Ends a protected part of a class      end protected                     OO:end_protected()
Applies a method to an object         Identifier1->Identifier2({Expr})  OO:call_method(Identifier1,Identifier2,{{Expr}})
Gets a member from a class            identifier1->identifier2          OO:get_member(identifier1,identifier2)
Sets a member of a class              identifier1->identifier2=expr     OO:set_member(identifier1,identifier2,expr)
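
   The preprocessor-like behaviour described in this section can be sketched as a statement rewriter. This Python model is an assumption-laden illustration, not the interpreter's actual mechanism: the function translate and its regular expressions are hypothetical, identifiers are limited to word characters, and {Expr} is read as an argument list placed inside a literal pair of braces.

```python
import re

# Fixed translations for the block keywords of the table.
BLOCK_TRANSLATIONS = {
    "end class": "OO:end_class()",
    "private do": "OO:begin_private()",
    "end private": "OO:end_private()",
    "public do": "OO:begin_public()",
    "end public": "OO:end_public()",
    "protected do": "OO:begin_protected()",
    "end protected": "OO:end_protected()",
}

CLASS_START = re.compile(r"class\s+(\w+)")
METHOD_CALL = re.compile(r"(\w+)->(\w+)\((.*)\)")
MEMBER_SET = re.compile(r"(\w+)->(\w+)\s*=\s*(.+)")
MEMBER_GET = re.compile(r"(\w+)->(\w+)")


def translate(statement):
    """Return the OO: translation of one statement, or the statement itself."""
    text = statement.strip()
    if text in BLOCK_TRANSLATIONS:
        return BLOCK_TRANSLATIONS[text]
    match = CLASS_START.fullmatch(text)
    if match:
        return "constant %s=OO:begin_class()" % match.group(1)
    match = METHOD_CALL.fullmatch(text)
    if match:
        return "OO:call_method(%s,%s,{%s})" % match.groups()
    match = MEMBER_SET.fullmatch(text)
    if match:
        return "OO:set_member(%s,%s,%s)" % match.groups()
    match = MEMBER_GET.fullmatch(text)
    if match:
        return "OO:get_member(%s,%s)" % match.groups()
    return text   # not an OOP construct: left unchanged
```

   For instance, translate("p->move(1,2)") yields OO:call_method(p,move,{1,2}), where p and move are made-up names.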



