

Part A: STRUCTURAL PRESENTATION.

   Throughout this documentation, written text may have several meanings which must be carefully separated. In order to do so, some graphical conventions will be used:

   The vertical bar ( | ) will denote a choice to be made among a finite number of options, as in for|if|while.

1 Types.

   The type of an atomic entity is defined by a basic type and, optionally, by a validation function. Nonatomic types are defined by the way other types are grouped together.

   Actually, OpenEuphoria uses two distinct, not always separated, notions of a type. A value has a type when the function that recognizes that type, and bears the same name, returns True for this value. A variable has a type when it can hold values with that type.

   Thus, the phrase "to pass a type check" is much more precise than "to have a type". Please consider that the latter is just a condensed way of writing the former.

1.1 Atomic types.

   The basic built-in atomic types do not need explicit user definition. They are:

   Note that no reference to actual size is being made. The programmer must know that he can use 32-, 64- or 128-bit integers, but the source need not. If it does, it actually refers to some bit stream representation of an integer. Such objects are dealt with using raw memory structures.

1.2 Nonatomic types.

   Nonatomic types group together other types in various ways. The supported layouts are:

   The type string is a shorthand for sequence of char. The type fixedstring(number) is a shorthand for array(number) of char.

   In a somewhat related way, arrays are just special kinds of sequences, so the only assignment problem is assigning a sequence to an array when lengths don't match. This will be handled by the fit() procedure (see 3.7.3.3 below).

Please also note that the object type can accommodate any nonatomic as well as atomic data.

1.3 Added validations and user-defined types.

   A user-defined type is a refinement of (a refinement of (...)) one of the above types, defined through a validation function. Such a function is defined using the keyword type or reftype as a routine type. It may have side effects, and must return an atomic value.

A variable x has the user-defined type mytype if:

   Please note that variables carry user-defined type information, but values don't. This subtle difference will surface in 5.3.2 (assigning elements to nonatoms and tracking their types).

1.4 Context types.

   The type of a member of a record may be specified using the context information of the other members of the record.

   To do this, the member type must be declared as "_" in the record declaration. Later, the type declaration has the following form:

[global ]type|reftype record_name.member_type_name(parameters)
...
end type|reftype

parameters is a list of one or two parameters (see 8.1 below).
To refer to members of the structure, use the .membername syntax.

Example:

record stringWithIndex(

string s,_ index)

--type of the index member will be defined later as stringWithIndex.index

end record

...some possibly unrelated code ...

type stringWithIndex.index(integer i)

return i>0 and i<=length(.s)

end type

--index must be a positive integer not greater than the length of the s member of the same structure.

   A reftype might have been used as well.
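   The stringWithIndex example can be mimicked outside OpenEuphoria. The sketch below is a Python model, not OpenEuphoria code. It assumes that the member check runs each time the index member is assigned. The class name and the use of TypeError for a failed check are illustrative choices only.

```python
class StringWithIndex:
    """Python model of the stringWithIndex record (hypothetical, for illustration)."""

    def __init__(self, s, index):
        self.s = s
        self.index = index  # goes through the checked setter below

    @property
    def index(self):
        return self._index

    @index.setter
    def index(self, i):
        # Mirrors the context type body: return i>0 and i<=length(.s)
        if not (isinstance(i, int) and 0 < i <= len(self.s)):
            raise TypeError("type check failed for stringWithIndex.index")
        self._index = i


rec = StringWithIndex("hello", 3)
rec.index = 5            # passes: 5 is not greater than the length of s
try:
    rec.index = 6        # fails: greater than the length of s
    failed = False
except TypeError:
    failed = True
```

   The point of the model is that the check on index needs access to the sibling member s, which is what the .s syntax provides in the type declaration.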

1.5 Type checking.

1.5.1 The general case.

   When a variable is going to be assigned a value, it may be checked that the variable can hold such a value. To determine this, the type or reftype function attached to the variable is called, and the check succeeds if it returns True. Any nonzero atom is interpreted as True. A failed type check causes an exception. The handler for this exception may take corrective action or just let the running program abort.

   When the with type_check directive is in force, this process is performed on every assignment. As this impacts performance, systematic type checking may be turned off. For obvious reasons, type checking will still take place at a few implementation specific places.

1.5.2 Forced type checking.

   When systematic type checking is turned off, you may wish to keep some control over which type checks are performed, because you need the checking - for instance, when the type functions have side effects.

   To this end, you can add the keyword check as a prefix to a (ref)type. Types thus earmarked are always checked, even when type checking is off.

   You can further restrict checking to some variables by creating a pair of twin types, one without the check prefix and one wrapping the latter, but with the check prefix.

   You can further fine tune the checking by having user defined types that return True (check passed) unless some condition is met, in which case some real action takes place instead.

1.6 Type aliasing.

   You can give alternate names to types. This may enhance code readability, as the same type may have several interpretations in the same program. But it is mostly needed to call type checking functions for types with a compound name, like sequence of integer. You can do this using the statement:

type | reftype altname is aliased

   This creates an alias and a function. The alias altname can be used wherever aliased could be used. The altname() function checks membership in the type aliased; it is a type or a reftype according to the statement used.

   Note that you can use compound types as typechecking function identifiers. So, the call

(sequence of array(4) of integer)(x)
is valid and checks whether x is a sequence of array(4) of integer. This scheme does not extend to other functions: their identifiers are only made of contiguous allowable characters.

2 Basic tokens.

2.1 Identifiers.

   An identifier is a string of consecutive letters, digits and underscores, starting with a letter. What a letter precisely means depends on implementation, but always includes the ranges 'a'-'z' and 'A'-'Z'.

   Lower and upper case letters are different; contrary to some other languages, OpenEuphoria is case sensitive, which means that the exact spelling of an identifier is taken into account.

   So, "var" and "VaR" are different, valid identifiers, while "_top" or "2read" are not valid. "3z2" may or may not be valid, according to implementation specific rules, and specifically whether they treat Ý as a letter or not. Euphoria only admits the minimal specification above as defining a letter.
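   The minimal identifier rule can be written as a regular expression. The following Python sketch only models the minimal letter set ('a'-'z' and 'A'-'Z'); an implementation that accepts more letters would need a wider character class. The function name is hypothetical.

```python
import re

# Minimal letter set only: 'a'-'z' and 'A'-'Z'. Implementations may accept more.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_identifier(text):
    """Return True if text is an identifier under the minimal rule."""
    return IDENTIFIER.fullmatch(text) is not None
```

   With this rule, is_identifier("var") and is_identifier("VaR") hold, while "_top" and "2read" are rejected.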

2.2 Quoted characters.

   A quoted character is a single (double-byte) character inside single quotes, or an escape sequence, also inside single quotes. Supported escape sequences are:

\' single quote
\" double quote
\n newline (ASCII 10)   (you cannot use \N)
\r carriage return (ASCII 13)
\t tab (ASCII 9)
\\ backslash
\(number) the character with ASCII/Unicode code number.

   Thus, 'f', '\n' or '\(3245)' are all characters.
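   As an illustration of the escape rules, here is a small Python model of a decoder for the text found between the two single quotes. It is a sketch only: the function name is hypothetical, and it assumes that the number in \(number) is written in decimal digits.

```python
SIMPLE_ESCAPES = {"'": "'", '"': '"', "n": "\n", "r": "\r", "t": "\t", "\\": "\\"}


def decode_quoted_char(body):
    """Decode the text found between the single quotes of a quoted character."""
    if not body.startswith("\\"):
        if len(body) != 1:
            raise ValueError("a quoted character holds exactly one character")
        return body
    rest = body[1:]
    if rest in SIMPLE_ESCAPES:
        # Lookup is case sensitive, so \N is rejected while \n is accepted.
        return SIMPLE_ESCAPES[rest]
    inner = rest[1:-1]
    if rest.startswith("(") and rest.endswith(")") and inner.isascii() and inner.isdigit():
        return chr(int(inner))  # \(number): the character with that code
    raise ValueError("unknown escape sequence: " + body)
```

   For instance, decode_quoted_char("f") is "f", and the body \(3245) decodes to the character with code 3245.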

2.3 Text.

   Text is anything between double quotes. It is not processed at all, except for escape sequence resolution.

   Verbatim text is enclosed between matching groups of three consecutive double quotes ("""). In verbatim mode, any character, including whitespace, is considered as part of the string. There is no special meaning for the backslash character, and there is no escape sequence processing as a result.

   There is also an intermediate long text mode. Strings in this mode are enclosed between matching $" and "$. line_end characters '\r' and '\n' are ignored in long text mode, but other characters, including escape sequences, are treated as in normal text mode.

Example 1:

"This is a very long string which "&
"has to be broken for readability reasons."

could be written as:

$"This is a very long string which
has to be broken for readability reasons."$

Example 2:

"This is a very long string which \n"& "has to be broken for readability reasons."

could be written as:

"""This is a very long string which
has to be broken for readability reasons."""

2.4 Numerical items.

   They fall into four categories:

The number must be in decimal digits.

   Internally, OpenEuphoria performs as many automatic conversions as it needs, taking advantage of available hardware, to minimize memory usage by numerical items, while retaining the precision of these numbers.

2.5 The infinity.

   Infinity is a mathematical concept alien to computers, as computers execute a finite number of machine instructions on finite size data; otherwise they are said to hang and must be switched off.

   However, the infinity concept is embodied in some floating point numbers, and has formal properties that can be implemented on computers.

   The [+]inf and -inf symbols are atoms with the following unusual properties:
Left operand         Operator   Right operand        Result
any atom             +          +inf                 +inf
any atom             +          -inf                 -inf
+inf                 +          any atom             +inf
-inf                 +          any atom             -inf
+inf                 +          +inf                 +inf
-inf                 +          -inf                 -inf
+inf                 +          -inf                 <error>
-inf                 +          +inf                 <error>
any atom             -          +inf                 -inf
any atom             -          -inf                 +inf
+inf                 -          any atom             +inf
-inf                 -          any atom             -inf
+inf                 -          +inf                 <error>
+inf                 -          -inf                 +inf
-inf                 -          +inf                 -inf
-inf                 -          -inf                 <error>
any positive atom    *          +inf                 +inf
any positive atom    *          -inf                 -inf
any negative atom    *          +inf                 -inf
any negative atom    *          -inf                 +inf
+inf                 *          any positive atom    +inf
-inf                 *          any positive atom    -inf
+inf                 *          any negative atom    -inf
-inf                 *          any negative atom    +inf
[+|-]inf             *          0                    <error>
0                    *          [+|-]inf             <error>
+inf                 *          +inf                 +inf
+inf                 *          -inf                 -inf
-inf                 *          +inf                 -inf
-inf                 *          -inf                 +inf
+inf                 /          any positive atom    +inf
+inf                 /          any negative atom    -inf
-inf                 /          any positive atom    -inf
-inf                 /          any negative atom    +inf
[+|-]inf             /          0                    <error>
[+|-]inf             /          [+|-]inf             <error>
any atom             /          [+|-]inf             0

   The <error> in the rightmost column is the ZeroDivide exception when dividing by 0, and MathIndeterminacy otherwise.
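   The arithmetic rules for the infinity symbols are close to what IEEE 754 floating point hardware does, except that indeterminate forms raise an exception instead of producing a NaN. The Python sketch below models this under that assumption. The exception class is a stand-in for MathIndeterminacy, Python's ZeroDivisionError stands in for ZeroDivide, and operands are assumed to be finite numbers or infinities.

```python
import math


class MathIndeterminacy(ArithmeticError):
    """Stand-in for the MathIndeterminacy exception named in the text."""


def checked(result):
    # For finite or infinite operands, floating point arithmetic yields NaN
    # in the indeterminate cases; the model turns that NaN into an exception.
    if math.isnan(result):
        raise MathIndeterminacy("indeterminate form")
    return result


def add(a, b):
    return checked(a + b)


def sub(a, b):
    return checked(a - b)


def mul(a, b):
    return checked(a * b)


def div(a, b):
    if b == 0:
        raise ZeroDivisionError("ZeroDivide")
    return checked(a / b)
```

   For example, add(1, math.inf) is +inf and div(3, math.inf) is 0, while add(math.inf, -math.inf) and mul(math.inf, 0) raise the stand-in MathIndeterminacy.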

   Other mathematical functions may act on the infinity symbols, with sometimes useful results:
Function           Symbol      Result
abs                [+|-]inf    +inf
arcsin, arccos     [+|-]inf    ArgError
cos, sin, tan      [+|-]inf    MathIndeterminacy
scale2, scale10    [+|-]inf    +inf
log, sqrt          +inf        +inf
log, sqrt          -inf        ArgError
exp                +inf        +inf
exp                -inf        0
arctan             +inf        PI/2
arctan             -inf        -PI/2

   The power function raises specific issues:
Base                       Exponent                   Result
0 < atom < 1               +inf                       0
1                          [+|-]inf                   1
1 < atom, +inf             +inf                       +inf
0                          [+|-]inf                   MathIndeterminacy
0 < atom < 1               -inf                       +inf
1 < atom, +inf             -inf                       0
any negative atom, -inf    [+|-]inf                   MathIndeterminacy
+inf                       any positive atom, +inf    +inf
+inf                       any negative atom, -inf    0
[+|-]inf                   0                          MathIndeterminacy
-inf                       positive even integer      +inf
-inf                       positive odd integer       -inf
-inf                       negative integer           0
-inf                       any non integer            MathIndeterminacy

2.6 Comments.

   Comments may appear at the end of any physical line of a source file. On an otherwise empty line, the comment may start the line.

   A comment starts with the characters -- and extends to the physical end of line (the next line_end).

   OpenEuphoria does not process comments in any way, and comments don't affect the code generated from the source file they appear in. The facility is provided in order to document your code so that others, or possibly yourself, can understand it easily enough to maintain or upgrade it. Time spent commenting code will often bring a large reward in terms of cuts in maintenance and debugging time, if nothing else.


   Precise, concise, relevant, useful commenting is an obscure art that may make the difference between ordinary and outstanding coders.



3 Operations.

   They are defined by the use of infix or prefix operators, as opposed to routine calls, which use prefix identifiers acting on a list of arguments enclosed between parentheses.

   Remember that an infix notation is one that goes in between its operands (like the usual multiplication), while a prefix notation appears before its operands.

3.1 Supported operators.

3.1.1 Supported operator list.

   They are:

+ addition of numbers
- subtraction of numbers
* multiplication of numbers
/ division of numbers
& concatenation of sequences
&& bitwise and
|| bitwise or
^ binary negation
~~ bitwise xor
<< binary left shift
>> binary right shift
>>> binary signed right shift

3.1.2 Precedence hierarchy.

   When two or more operators appear in a row, without parentheses separating them, there is a choice to be made: which operation to perform first? This is an important question, since the results may differ.

   There is a predefined set of rules to help the OpenEuphoria interpreter make a reasonable guess. The rules may be overridden using parentheses to force another evaluation order.

   Here is the chart of operator precedence:

highest precedence: routine calls
  unary +/-
  bit-level operations
  * /
lowest precedence: &

   Routine calls are evaluated first, then parenthesized expressions, starting at the deepest nesting level of them. The order of evaluation of items of the same precedence is undefined.

   Thus, for instance, 3+2*4 is the same as 3+(2*4). To perform the addition first, code (3+2)*4.

   Also, if the function f sets x to 3 whatever its argument, x+f(x) is 3+3=6 regardless of what x is.

3.2 Extension to nonatomic types.

   If exactly one operand of any of the above operators is not atomic while the other operand is, the operation will be performed on each of the elements of the former operand. So, adding 1 to an array means adding 1 to each array element.

   If both operands are nonatomic and have the same length, the operation is performed on each pair of matching elements in turn. So, {3,5}+{2,-4}={3+2,5-4}. An exception will occur on length mismatch.
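   The two extension rules can be modelled in a few lines. The Python sketch below represents nonatoms as lists and atoms as anything else. It also assumes that the rules apply recursively to nested elements, which the text does not state explicitly. The function names are hypothetical.

```python
import operator


def is_atom(x):
    return not isinstance(x, list)


def apply_op(op, a, b):
    """Apply a binary operator with the extension rules of section 3.2."""
    if is_atom(a) and is_atom(b):
        return op(a, b)
    if is_atom(a):
        return [apply_op(op, a, y) for y in b]      # atom with nonatom
    if is_atom(b):
        return [apply_op(op, x, b) for x in a]      # nonatom with atom
    if len(a) != len(b):
        raise ValueError("length mismatch")         # exception on mismatch
    return [apply_op(op, x, y) for x, y in zip(a, b)]
```

   With this model, apply_op(operator.add, [3, 5], [2, -4]) gives [5, 1], matching {3,5}+{2,-4} above.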

   Contrary to Euphoria, this scheme does not automatically extend to logical operators. The with seq_compat or with RDS directive turns the legacy behaviour on and off at will.

3.3 Formation of nonatomic objects.

   The construct {{expression}} creates an object of nonatomic datatype whose first element is the first expression in the list and so on. {} denotes an empty nonatomic object.

   For this purpose, structures are ordered in the way their elements were declared in the structure definition of their type.

   "" denotes an empty string.

3.4 Accessing elements of nonatomic objects.

   Single elements of nonatomic objects (or nonatoms in the sequel) are accessed using an index enclosed between square brackets, as in: ThisList[4]. Note that any nonatom, even the returned value from a function, may be indexed.

   Structures have named parts, or members, which are accessed using the syntax: record name.member name, as in: ThisCustomer.name.

   Since structure fields are declared in an ordered way, records also support indexed accessing: the index n then refers to the n-th field in the declaring enumeration.

   Indexes may be negative, in which case the elements are counted backwards. So, ThisList[-1] is the last element of ThisList, ThisList[-2] the second last and so on.

   You can use floating point numbers as indexes. They are rounded to the next integer downward before any further processing. So, s[-0.3] is s[-1].

   0 is never a valid standalone index. See section 3.5 below for valid uses of 0 in index specifications.

   Indexes whose absolute value is greater than one plus the length of the container they index always cause an exception. 0 and +/-(length(container)+1) are only allowed when specifying an empty slice (see 3.5.2 below).
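   The rules for a standalone index (rounding downward, counting backwards for negative values, rejecting 0 and out-of-range values) can be summarised by the following Python sketch. It is only a model: the function name is hypothetical and IndexError stands in for whatever exception OpenEuphoria raises.

```python
import math


def normalize_index(index, length):
    """Map a standalone index to a 1-based position, per section 3.4."""
    i = math.floor(index)          # floating point indexes are rounded downward
    if i < 0:
        i = length + 1 + i         # -1 is the last element, -2 the second last
    if not 1 <= i <= length:
        raise IndexError("invalid index: %r" % (index,))
    return i
```

   For a container of length 7, normalize_index(-1, 7) and normalize_index(-0.3, 7) both give 7, while 0 and 8 are rejected.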

3.5 Statically accessing parts of nonatomic objects.

3.5.1 Nonempty slices.

   Accessing several elements in a row is possible, and is done through slices. A slice is a comma separated list of indexes and ranges. A range is specified as lower..upper, where lower and upper are the lower and upper desired index values. Obviously, the latter is not less than the former, after conversion to positive standard indexes.

   So, the statement:

NewList=ThisList[3,1..-4,-2,3..6]

   generates a list formed of elements of ThisList, in the following way:

NewList[1] is ThisList[3]
NewList[2] is ThisList[1]
NewList[3] is ThisList[2]
...
NewList[-5] is ThisList[-2]
NewList[-4] is ThisList[3]
...
NewList[-1] is ThisList[6]

   Of course, if an element of a non-atom is a non-atom itself, several square bracketed index specifications may follow one another, like in: mymatrix[1..3][4]. This is a sequence of length three, exactly {mymatrix[1][4],mymatrix[2][4],mymatrix[3][4]}.
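   To make the slice semantics concrete, here is a Python model of a slice specification given as a list of indexes and (lower, upper) ranges. It is a sketch only: the names are hypothetical, the sample data is made up, and index validation is left out for brevity.

```python
def position(i, length):
    """Convert a possibly negative 1-based index to a positive one."""
    return length + 1 + i if i < 0 else i


def take(seq, spec):
    """Model seq[spec], where spec lists indexes and (lower, upper) ranges."""
    n = len(seq)
    picked = []
    for item in spec:
        if isinstance(item, tuple):
            lower, upper = position(item[0], n), position(item[1], n)
            picked.extend(seq[lower - 1:upper])   # inclusive range lower..upper
        else:
            picked.append(seq[position(item, n) - 1])
    return picked


# Models NewList=ThisList[3,1..-4,-2,3..6] on an 8-element list.
this_list = [10, 20, 30, 40, 50, 60, 70, 80]
new_list = take(this_list, [3, (1, -4), -2, (3, 6)])

# Models mymatrix[1..3][4]: the subscript applies to each element of the slice.
matrix = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
column = [row[4 - 1] for row in take(matrix, [(1, 3)])]
```

   Here new_list starts with this_list's third element, its fifth element from the end is this_list's second last element, and column collects the fourth element of each of the three rows.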

   For structures, use names rather than indexes, even though they are just as valid ways to access structure parts. For instance, the following

CustList[27..41][name,zipcode,nbOrders]

will generate a sequence of data extracted from a 15-element subsequence of CustList starting with the 27-th. We assumed that CustList is a sequence of Customers, which are structures the declaration of which involves members named name, nbOrders and zipcode. The statement above generates a sequence since the type of all its elements is the same. Each of its elements is a sequence (of object) of length 3, since it is quite likely formed by a string, another string and an integer.

   name has a rank in the enumeration of fields that build the Customer type. If that rank is 3, you could code

CustList[27..41][3,zipcode,nbOrders]

with exactly the same meaning as above. However, if that rank changes in future versions of your program, the "3" index will have to be changed to its new value, while the field name would remain the same. This is why using names is recommended over using indexes when possible.

3.5.2 Empty slices.

   You may specify empty slices when they are made of ranges the upper value of which is exactly one less than the lower value. One of the index values must be valid though.

   Thus, s[2..1], s[4..-5] and s[1..0] are all empty objects, assuming length(s)=7 so that -5 reads as 3. The last example is the only valid use of 0 in indexes; any other situation causes an exception. In the same vein, s[2..3,1..0] has length 2, while s[3..2,1..0] is empty, since the set of selected indexes is the union of two empty sets.

    But s[13..12] causes an exception, since both index values are way out of range.
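   The rule for empty slices can be modelled as follows. This Python sketch resolves a single lower..upper range into the list of selected positions. It assumes that an empty range is accepted as soon as one of its two bounds is a valid index, as stated above. The function name and the use of IndexError are illustrative.

```python
def resolve_range(lower, upper, length):
    """Return the 1-based positions selected by lower..upper."""
    def position(i):
        return length + 1 + i if i < 0 else i

    def valid(p):
        return 1 <= p <= length

    lo, hi = position(lower), position(upper)
    if hi == lo - 1:
        # Empty slice: allowed as long as one bound is a valid index.
        if valid(lo) or valid(hi):
            return []
        raise IndexError("empty slice out of range")
    if lo <= hi and valid(lo) and valid(hi):
        return list(range(lo, hi + 1))
    raise IndexError("invalid slice")
```

   With length 7, the ranges 2..1, 4..-5 and 1..0 all resolve to an empty list, 2..3 resolves to two positions, and 13..12 is rejected.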

3.6 Dynamically accessing parts of non-atoms.

   If s has nonatomic type and if t has the format described below, s[[t]] is a valid syntax for a part of s with variable index depth. This syntax is specially handy for tree management. s[[t]] is s[t[1]][t[2]]...[t[length(t)]].

   Each t[i] must be a sequence made of atoms and sequences of length 1 or 2. Atoms are converted to sequences of length 1, and both specify single indexes. Sequences of length 2 stand for slices in an obvious lower..upper way.

   For instance, assume t={2,{-1,{3,4}},{{1,4}}}. Then

s[[t]]=s[2][-1,3..4][1..4]

   This sequence has three elements, each of which is of length 4. Each of them consists of the 4 first elements of elements of s[2]. The first of these elements is the last of s[2]; the second and third are respectively third and fourth in s[2].

   The sequence t is said to be a subexpression representation for s[[t]] relative to s.

   Additionally, note that a list of indexes may come from a sequence through desequencing (see 5.5.2 below), so that s[#(t)] stands for s[t[1],t[2],...,t[-1] ].
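   The s[[t]] notation can be modelled recursively. In the Python sketch below, an atom in t is a plain subscript, and a sequence in t is a slice whose remaining steps are applied to each selected element. Treating every sequence step as a slice, even one that lists a single index, is an assumption of this model. All names and the sample data are hypothetical.

```python
def position(i, n):
    """Convert a possibly negative 1-based index to a positive one."""
    return n + 1 + i if i < 0 else i


def select(seq, spec):
    """Apply one bracketed spec made of indexes and [lower, upper] pairs."""
    n = len(seq)
    out = []
    for item in spec:
        if isinstance(item, list) and len(item) == 2:
            lower, upper = position(item[0], n), position(item[1], n)
            out.extend(seq[lower - 1:upper])
        else:
            index = item[0] if isinstance(item, list) else item
            out.append(seq[position(index, n) - 1])
    return out


def dynamic_access(s, t):
    """Model s[[t]], i.e. s[t[1]][t[2]]...[t[length(t)]]."""
    if not t:
        return s
    step, rest = t[0], t[1:]
    if not isinstance(step, list):
        # An atom is one plain subscript: descend into that element.
        return dynamic_access(s[position(step, len(s)) - 1], rest)
    # A sequence is a slice: the remaining steps apply to each selected element.
    return [dynamic_access(elem, rest) for elem in select(s, step)]


grid = [[10 * r + c for c in range(1, 6)] for r in range(1, 6)]
s = ["first", grid]
t = [2, [-1, [3, 4]], [[1, 4]]]          # models s[2][-1,3..4][1..4]
result = dynamic_access(s, t)
```

   The result has three elements of length 4: the first four elements of the last, third and fourth elements of s[2], in that order.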

3.7 Manipulating nonatoms.

   Manipulating nonatoms becomes necessary as soon as they are created and populated. While these manipulations can be done through a limited set of operations and routines stored in external files, this implies loss of performance and frequent reinventing of the wheel. For these reasons, OpenEuphoria provides quite a few built-in handling routines for nonatoms.

3.7.1 Getting information about nonatoms.

   Functions are provided in order to know how many, and which, elements are in the nonatom:

length(target)
is the number of elements of the nonatom target.
find(what,target)
returns 0 if what is not an element of target. Otherwise, returns a positive integer, which is the index of the first element of target that equals what.
find_all(what,target)
returns the possibly empty sequence of all integers which are indexes of elements of target which equal what.
match(what,target)
returns the lowest integer such that some slice of target starting at that position equals what. If there is no such integer, and what is not the empty sequence, 0 is returned. If what is the empty sequence, -1 is returned if target is not empty, and -2 if it is.

If what is an atom, match returns as find would.

match_all(what,target)
returns the possibly empty list of all integers i such that some slice of target starting at i equals what. Always returns the empty sequence if what is the empty sequence.
wildcard_match(pattern,subject)
Returns True if subject matches pattern and False otherwise. pattern may contain '?' and '*' wildcards.
wildcard_file(pattern,subject)
Same as wildcard_match, but platform-dependent, as the matches are case sensitive only on Linux/FreeBSD systems.
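   The return conventions of find, find_all and match, in particular the special values returned by match for an empty what, can be modelled as below. This is a Python sketch with lists standing for nonatoms and 1-based results, not the built-in implementation.

```python
def find(what, target):
    """1-based index of the first element equal to what, or 0."""
    for i, elem in enumerate(target, start=1):
        if elem == what:
            return i
    return 0


def find_all(what, target):
    """All 1-based indexes of elements equal to what (possibly empty)."""
    return [i for i, elem in enumerate(target, start=1) if elem == what]


def match(what, target):
    """Lowest 1-based start of a slice of target equal to what."""
    if not isinstance(what, list):       # an atom: behave as find
        return find(what, target)
    if not what:                         # empty sequence: special codes
        return -1 if target else -2
    for i in range(1, len(target) - len(what) + 2):
        if target[i - 1:i - 1 + len(what)] == what:
            return i
    return 0
```

   For example, match([3, 5], [1, 3, 5, 3]) is 2, match([], [1]) is -1 and match([], []) is -2.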

3.7.2 Adding elements to sequences.

   Elements may be added to nonatoms as single objects or sequences, at any position in the sequence.

   Here is a list of available routines:

the & operator.
If s1 and s2 are nonatoms, s1 & s2 is a sequence of length the sum of the lengths of s1 and s2. The length(s1) first elements of s1 & s2 are those of s1; they are followed by those of s2.
append(s,x)
is a sequence obtained by adding x to s as its last element. Its length is length(s)+1 whatever x is.
prepend(s,x)
is a sequence obtained by adding x to s as its first element. Its length is length(s)+1 whatever x is.
insert(target,places,added)
is a sequence where the elements of nonatom added are inserted as single objects inside the nonatom target at the locations given by the nonatom places. The length of the returned sequence is the sum of the lengths of target and added.
places must be strictly increasing; strange, but sometimes desired effects may result otherwise. places may be an atom, in which case it is converted into a sequence of length 1 before further processing.
insert_sequence(target,places,added)
is a sequence where the elements of nonatom added are inserted as sequences inside the nonatom target at the locations given by the nonatom places. The length of the returned sequence is the sum of the length of target, the lengths of the nonatoms of added and the number of atoms in added.
places should be strictly increasing; strange, but sometimes desired effects may result otherwise. places may be an atom, in which case it is converted into a sequence of length 1 before further processing.

3.7.3 Removing elements from sequences.

   Three functions are provided:

3.7.3.1 The remove function.

    remove(target,places) returns the sequence target from which the elements whose index belongs to the sequence of integers places were removed, notionally starting from the last. places is assumed to be sorted in ascending order; strange, but sometimes desired results might happen otherwise.

3.7.3.2 The replace function.

    replace(target,places,items) returns a nonatom of the same type as target. It is obtained by removing from target the slices specified in places and replacing the carved out slices by elements of the sequence of nonatoms items, inserted as sequences. Each element of the sequence places is a pair of integers, the lower and upper bounds of each slice to process. items must have the same length as places, as each slice specified in places is replaced by the element at the same position in items.

   If places has the form {i1,i2}, where i1 and i2 are integers, this is converted to {{i1,i2}} first. If places is just an integer i, this is changed to {{i,i}}.

3.7.3.3 The fit procedure.

   The call fit(target,source,padding) causes source to be copied to target even though the lengths may not match. In this case, an ordinary assignment would have raised an error.

   If length(target)<=length(source), only the portion of source that fits into target is copied, effectively discarding elements of source with higher indexes.

   Otherwise, if padding is a char, the elements of target in excess relative to source's length are replaced by that char. padding may have the special value _, in which case these elements of target remain unchanged.
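   The behaviour of fit can be sketched as a Python function that returns the new contents of target instead of modifying it in place. The KEEP sentinel is a hypothetical stand-in for the special padding value _, and the function is a model, not the built-in procedure.

```python
KEEP = object()  # stands for the special padding value _


def fit(target, source, padding=KEEP):
    """Return target's new contents after fit(target,source,padding)."""
    n = len(target)
    if n <= len(source):
        # Only the portion of source that fits is copied.
        return list(source[:n])
    if padding is KEEP:
        tail = list(target[len(source):])        # excess elements stay unchanged
    else:
        tail = [padding] * (n - len(source))     # excess elements are padded
    return list(source) + tail
```

   For example, fitting "hi" into a six-character target with padding ' ' gives "hi" followed by four spaces, while the special value _ leaves the last four characters of target as they were.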

3.7.4 Permutations on non-atoms.

   Nonatoms are ordered sets of elements; so, they can be reordered. As the number of permutations on a given number of symbols rapidly increases with that number, it is neither practical nor efficient to directly specify a permutation of a nonatom. However, the following functions cover the most frequent cases and can be combined into any sort of shuffling.

reverse(target)
returns the nonatom target with its order reversed: the first element becomes the last, the second element becomes the second last, and so on.
move(target,start,end,where)
moves target[start..end] to position where in target. An error will occur if the shifted slice extends past the end of the nonatom, i.e. if where+end-start-1 is greater than length(target).
sort(target)
Returns target sorted in increasing order.
custom_sort(sort_id,target)
Returns target, sorted in increasing order from the sorting routine's standpoint. Like compare, this routine takes two arguments and returns an integer in {-1,0,1}. The sorting routine has sort_id as its routine_id. routine_id is explained here.
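   A comparison-driven sort like custom_sort can be illustrated in Python with functools.cmp_to_key. In this sketch the comparison routine is passed directly, whereas OpenEuphoria passes its routine_id. The by_length routine is a made-up example.

```python
from functools import cmp_to_key


def by_length(a, b):
    """Comparison routine: returns -1, 0 or 1, like compare."""
    return (len(a) > len(b)) - (len(a) < len(b))


def custom_sort(compare_routine, target):
    # The routine itself is passed here; OpenEuphoria would pass its routine_id.
    return sorted(target, key=cmp_to_key(compare_routine))


def reverse(target):
    """Return target with its order reversed."""
    return target[::-1]
```

   For instance, custom_sort(by_length, ["ccc", "a", "bb"]) orders the strings from shortest to longest, and reverse applied to the result gives the opposite order.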

3.7.5 Slice transfers.

   Also called destructive assignments, they allow you to simultaneously remove a slice from a nonatom and make it appear in another one. The following four procedures perform this task:

cut_paste(target,position,source,start,end)
Removes the slice [start..end] from source and copies it onto target, starting at position. An error occurs if the pasted slice extends past the end of target.
transfer(target,position,source,start,end)
Removes the slice [start..end] from source and inserts it into target, the first element of the cut slice appearing at position.
transfer_as_one(target,position,source,start,end)
Removes the slice [start..end] from source and inserts it into target at position as a new element.
swap_slices(target,t_start,t_end,source,s_start,s_end)
Swaps the slices source[s_start..s_end] and target[t_start..t_end]. The slices need not have the same length.

   Since these routines involve moving data away from its original location, their use may require a higher level of carefulness than other more classic nonatom handling tools.

3.7.6 Other operations on nonatoms.

   Here are a few routines operating on nonatoms that didn't seem to fit in previous sections:

repeat(what,howmany)
Returns a sequence of length howmany, each element of which equals what.
repeat_pattern(what,howmany)
Returns a sequence of length howmany*length(what), made of howmany copies of what concatenated together. Acts as repeat above if what is atomic.
lower(string)
Converts all chars from string to lower case.
upper(string)
Converts all chars from string to upper case.
sprint(anything)
Returns the flat text representation of anything as a string. The I/O procedure print does the same job, but sprint outputs to a string and not to an I/O channel.
sprintf(format,values)
Returns the result of replacing, in format, each format specifier by the value at the matching position in values, printed according to that specifier.
value(string)
Reads a valid flat text representation of an object from string as get would from an I/O channel, and returns a pair {status,value} as get would.
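   The difference between repeat and repeat_pattern is easiest to see side by side. The following Python sketch models both, with lists standing for nonatoms; it is an illustration, not the built-in code.

```python
def repeat(what, howmany):
    """A sequence of length howmany, each element of which equals what."""
    return [what for _ in range(howmany)]


def repeat_pattern(what, howmany):
    """howmany copies of what concatenated together."""
    if not isinstance(what, list):
        return repeat(what, howmany)     # atomic what: acts as repeat
    return what * howmany
```

   With what = [1, 2] and howmany = 3, repeat yields three elements that are each [1, 2], while repeat_pattern yields the six-element sequence 1, 2, 1, 2, 1, 2.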

3.7.7 A further note about slicing.

   Euphoria 2.4 only allows slicing once, as the last operation after any number of simple subscriptings. This allows subscripting and slicing to be seen as operators, as t=s[1][2..3] is the same as s1=s[1] t=s1[2..3].

OpenEuphoria lifts the restriction, at the cost of no longer considering slicing and subscripting as operators. The syntaxes allowed in Euphoria keep the same meaning in OpenEuphoria. However, s[2..3][1] is definitely not the same as s1=s[2..3] t=s1[1]. The latter is a convoluted manner of assigning s[2] to t. s[2..3][1] is meant to be a two element sequence, namely {s[2][1],s[3][1]}.

4 Condition evaluation.

   Conditions are made of clauses linked together by logicals. A condition must evaluate to a boolean value of True or False. 0 stands for False, any other atom stands for True.

4.1 Truth tables for logicals.

   A truth table is a table assigning a boolean return value to any pair of booleans. To draw truth tables easily, we'll represent True by T and False by F.

4.1.1 The "and" operator.

      ! F ! T !
   ---+---+---!
    F ! F ! F !  Read: "and" returns False, except when both arguments are True.
   ---+---+---!  In that case only, it returns True.
    T ! F ! T !
   -----------!

4.1.2 The "or" operator.

      ! F ! T !
   ---+---+---!
    F ! F ! T !  Read: "or" returns True, except when both arguments are False.
   ---+---+---!  In that case only, it returns False.
    T ! T ! T !
   -----------!

4.1.3 The "not" operator.

      !   !
   ---+---!
    F ! T !  Read: "not" returns True if its argument is False, and False
   ---+---!  otherwise.
    T ! F !
   -------!

4.1.4 The = operator.

      ! F ! T !
   ---+---+---!
    F ! T ! F !  Read: "=" returns True when its operands have the same boolean
   ---+---+---!  value; else it returns False. This is the truth table of the
    T ! F ! T !  "not xor" logical operator.
   -----------!


4.2 Short-circuit evaluation.

   From close inspection of the tables above, it follows that you need not always compute both arguments of a logical to know its return value; computing the first one is often enough.

   This saves useless instruction execution, and may greatly simplify programming. The short-circuit rules are:

  • The second argument of "and" is computed if and only if the first argument is True.
  • The second argument of "or" is computed if and only if the first one is False.

   Note that short-circuit evaluation applies to any use of logicals. This is not true in Euphoria, where it only applies inside the conditions of if, elsif and while statements. The with RDS directive turns compatibility mode on and off in this respect as well. The directive with[out] all_short_circuit toggles this setting independently of the others.

4.3 Example code: finding a name in an address book.

   Assume Address is a record type that has a member called name, and that addrbook is a sequence of Address. Then

i=1
while i<=length(addrbook) and addrbook[i].name!="myname" do

i+=1
end while
if i>length(addrbook) then i=0 end if

will scan the address book for a record whose name member is equal to "myname". A value of 0 stands for name not found; else i holds the ordinal of the first occurrence of "myname" in a member of a record in addrbook.

   Without short-circuit evaluation, this code would fail if "myname" is not found, because addrbook[length(addrbook)+1] would be evaluated, causing an exception. In such a case, the code would be something like:

found=0
for i=1 to length(addrbook) do

if addrbook[i].name="myname" then
found=i exit
end if

end for

   So, an extra state variable is needed: even if i is available after the end for statement, a maximal value for i may mean that "myname" appeared as the last name or did not appear at all. The found variable is 0 on failure; otherwise it has the same meaning as i above. And what if there was no exit statement?
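   Python's and operator also short-circuits, so the first version of the search can be transcribed almost literally. The sketch below is a Python model of that loop, with dictionaries standing in for Address records; the function name and the sample data layout are hypothetical.

```python
def find_name(addrbook, wanted):
    """Return the 1-based position of the first matching record, or 0."""
    i = 1
    # `and` short-circuits: addrbook[i - 1] is never evaluated once i is
    # past the end, so no out-of-range access can happen.
    while i <= len(addrbook) and addrbook[i - 1]["name"] != wanted:
        i += 1
    return 0 if i > len(addrbook) else i
```

   Called on a book holding the names "ann" and "bob", find_name returns 2 for "bob" and 0 for a name that is absent, without needing a separate found variable.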

4.4 Side effects.

   As routine calls are resolved first, they may affect the variables appearing in a condition.

   Further, it may be desirable to record the value of expressions that appeared inside conditions. Because of the short-circuit evaluation capabilities of OpenEuphoria, it is not always possible to compute the expressions prior to the condition evaluation, as this may raise exceptions.

   To address this situation, you can embed assignments in conditions, using the := form of the assignment operator.

   So:

if f0(a)=x:=f(a) and b=y:=g(a) then ...

will result in the following:

  • x will always hold the value of f(a), regardless of what happens next. The x:=f(a) assignment might have been taken out of the if statement, for better readability.
  • if f0(a)=f(a), y will hold the value of g(a) at the time it was computed, thus taking into account the possible side effects of f and f0. It is not modified otherwise.

   An obvious use of this feature is to know why an if block was entered or not in the case of several clauses in the condition.
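
   Python offers a comparable := operator inside expressions, which makes it possible to sketch the behaviour described in this section. This is an analogy under assumptions, not OpenEuphoria code: the function name check and the use of None for "not modified" are choices made for the sketch, and Python requires parentheses around each embedded assignment.

```python
def check(a, b, f0, f, g):
    """Return (entered, x, y); y stays None when g is never called."""
    x = y = None
    # x is always assigned, because the first comparison is always evaluated.
    # y is assigned only if the first comparison holds (short-circuit).
    entered = f0(a) == (x := f(a)) and b == (y := g(a))
    return bool(entered), x, y
```

   Inspecting x and y after the test tells which clause decided the outcome: a y that was left untouched means the first comparison already failed.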


5 Variables.

   Variables are sets of data with enough logical links to be referred to by tags. These tags help identify this data for the program where they appear to act upon them. These tags are general_identifiers.

5.1 Properties of a variable.

   A variable has a number of attributes, or attached data that can be retrieved. They are called metadata, and are retrieved using the construct general_identifier@meta.

For variables, the available metadata are:

name
x@name is "x". Seems redundant, but see 5.4 below.
assigned
x@assigned is False if x never was assigned a value, True otherwise.
value
x@value holds the contents of x. Valid only if x@assigned is True. Provided for completeness only.
size
the number of bytes x occupies in memory. This is mainly useful for interfacing with other languages.
type
the routine id of the type checking routine assigned to the variable when it was declared.
deftype
the routine id of the common type of all elements of a nonatom. This is -1 for atoms.
id
an integer you can use as an alternate way to access x (see 5.4).
scope
a value that tells in which part of the program the symbol is defined. See 5.2 below.
format
a default format used to display the value of the variable. See format string specification in the entry for printf() in part C.
decl_mode
this is True if the variable was declared using new_var(), and False otherwise.
readonly
roNo for variables, roYes for locked variables, roConst for constants.
types
is a sequence of integers the length of the nonatom. Each integer is the routine_id of the type function of the matching element. For atoms, this sequence is empty.

   Only the value and format metadata can be directly changed; the other attributes are read-only, or can be changed only through dedicated routines. Metadata also apply to RAM structures, but with slight variations.

   However, due to their special nature, the deftype metadata of raw RAM structures can be changed. See the specifics under specific meaning of metadata.

   A structure of all metadata a single symbol has can be retrieved using the get_meta function. The structure has the reserved type SystemVarMeta and has elements with the names and indexes as in the list above. The argument of get_meta is an expression evaluating to the id or name of the variable the metadata of which are requested.

   Formats are specified as for printf(). See the entry for this function in the alphabetical part C. The @format is used only if its value is not "", which is the default.

5.2 Scope of a variable.

   A program is made of a main file (the one you feed the interpreter with) and zero or more auxiliary files. Named scopes inside files may exist (see Chapter 6, "Included files and namespaces".). Both are referred to as abstract files.

A symbol can be visible:

  • from more than one abstract file in the program
  • from the abstract file it lives in only
  • from only part of a single file

   As a result, the scope metadata has three possible values: sGlobal, sPublic and sPrivate, respectively.

   Symbols that have different scopes can coexist. But, at any given time, only one of them is referred to by the name they share. This symbol is said to shadow the others.

   Private symbols shadow public symbols, and public symbols shadow global symbols without namespace.
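
   The shadowing order can be modelled as a chain of lookup tables searched from the innermost scope outwards. The Python sketch below is only an illustration: the three dictionaries and their contents are hypothetical, and the model ignores namespaces altogether.

```python
from collections import ChainMap

# Hypothetical symbol tables, one per scope level.
global_syms = {"count": "global count", "total": "global total"}
public_syms = {"count": "public count"}
private_syms = {}

# Lookup order: private first, then public, then global.
visible = ChainMap(private_syms, public_syms, global_syms)
```

   With these tables, visible["count"] resolves to the public symbol; adding a "count" entry to private_syms would make the private symbol shadow both others, while visible["total"] still reaches the global one.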

   A clash between two symbols sharing the same name and both visible at some point is an error condition, since the interpreter does not know which one the general_identifier designates. Obviously, the error occurs only when the ambiguous symbol is used.

   The word "symbol" is purposely used here instead of "variable", because the notions above also apply to routines (see Chapter 8 "Routines").

5.3 Declaring a variable.

5.3.1 Type of a variable.

   Types in OpenEuphoria describe logical properties of values a variable may hold. There are four ways to declare a variable, and all but one require an explicit type:

  • declaration in a var-decl statement;
  • declaration by on-the-fly creation;
  • declaration as formal parameter of a routine;
  • declaration as a for loop index

   Only in the last case explicit typing is absent. But, from the values the three parameters of a for loop have, an integer or atom type is guaranteed.

Formally, there are four sorts of types in OpenEuphoria:

  • Built-in types do not require definitions; the language gives them away for free. They are listed in Chapter 1.
  • Nonatomic types built using built-in types
  • User-defined types (see 1.3);
  • Compound types not entirely made from built-in types (see Nonatomic types).

5.3.2 Type of a nonatom element.

   Nonatoms rely on a default type, which is their deftype metadata. Elements of nonatoms may have any type, but they should pass the type checking thus defined. They are registered as having this default type.

   The programmer always has the option to specify the type of an element in a nonatom using the cast primitive. When this happens, a type check of the current element using the supplied type is performed, again regardless of current type checking status.

   A call to the cast primitive has the following form:

   cast(container,indexes,type)

   This procedure call acts on the nonatom container by name or id. It sets the type information for elements in container[indexes] to type. type is either a type name or the routine id of the type function. indexes may be any slice specification.

5.3.3 Declarations.

   A variable must be declared before being used. The only exception to this principle is for loop indexes.

   A variable definition takes the following form:

   [global |static ]type {identifier[=value] }

, a type name followed by one or more items. These items are either variable names or name=value initialized variables.

   The variable's initial value is computed before the variable is created. This allows an identifier to shadow another one while retrieving the shadowed value at initialization time.

   The optional global keyword makes the variable(s) visible outside of their current abstract file, giving them a scope metadata of sGlobal. It is not allowed for private variables in routines.

   The optional static keyword applies to routine private variables. It makes their values persist between invocations of the routine.

   A declaration may appear in any place outside routines or blocks.

    Declarations in routines must be grouped right after the routine definition, as in:

function deloddnumbers(sequence of integer s)
integer i=1
sequence s0

--you can't move either of the two lines above past here.

s0=remainder(s,2)
while i<=length(s) do

if s0[i]!=0 then
s=remove(s,i)
s0=remove(s0,i)     --keep the remainders aligned with s
else i+=1     --it is easy to forget, but definitely necessary...
end if
end while
return s
end function

   Additionally, since routine variables are private, you cannot declare them as global.

   Section 5.4 below will show you how to relax the restrictions above.

5.3.4 Constants.

   Constants are identifiers that are assigned a value at initialization time. That value cannot change hereafter. Using constants instead of hard-coded repeated values is recommended for two reasons at least:

  • VAT_rate may look more self-explanatory, when looking at the program source for maintenance or debugging, than say 0.0825;
  • If the repeatedly hard-coded value is to change, all relevant instances of it must be changed in scattered places: this may prove a tedious, error-prone process. By contrast, declaring that value as a named constant means only one place in the code has to change.

   Declaring a constant takes the following form, quite similar to a variable declaration:

[global ]constant {[type ]name=value}
Indeed, you can declare any number of constants in a single statement.

   This statement may appear anywhere a variable declaration is allowed. Contrary to variables, the type specification is optional, a type of object being assumed if it is not present.

   It may happen that a constant is declared with some value even though a constant with the same name and the same value is visible. In this case, OpenEuphoria ignores the duplicate declaration, whereas Euphoria throws an error. Note that a constant defined inside a routine cannot be global.

   Attempting to modify the value of a constant will raise an exception. There is no way to change the value of a constant using only OpenEuphoria statements.

5.4 Variable id's.

5.4.1 The id metadata.

   Rather than being referred to by its name, a symbol can be accessed through its id metadata. Routines will have routine_id's (see Chapter 8), and variables have variable_id's. One may consider that all variables are named elements of a large sequence, and the variable_id's are indexes into this sequence.

   When a variable is destroyed in any fashion, mainly because it is a private, nonstatic variable of a returning routine, its id is not recycled. This guarantees that an id always refers to the same variable or to no variable at all, in which case any kind of use causes an error.

   The built-in function isvarid takes an integer and returns True if this integer is the id of a variable and False otherwise.

   Individual elements of nonatoms have variable ids, so that "s[3][5]@id" makes sense and returns an integer you can use as shown below. The id "follows" the element it tags during the transformations of the host array/sequence, so that the returned id may well give you the contents of s[2][7] if some elements were added or removed from s[3] or s.

5.4.2 Manipulating existing variables.

Five routines are provided to handle variables through their id's:

  • id(name) returns the id of the variable whose name name evaluates to. -1 is returned if no such variable exists.
  • get_var(id) returns the value of the variable with that id.
  • set_var(id,value) sets the value of the variable with that id to value.
  • var(id) returns the name of the variable with that id. For elements of nonatoms, their name is returned, or "" if none is applicable.
  • analyze_id(id) returns a sequence of object of length 3. The first term is the variable name, or element name, or "" for unnamed nonatom elements. The second element is the index of the element if applicable, or 0 else. The third element is the id of the parent if applicable, or id itself otherwise.
    Recursively calling analyze_id will yield the index sequence by which you can access the element with id id deeply nested in a nonatom.

   Note that set_var will fail if the symbol with this id cannot be written to ( var(id)@readonly != roNo ).

   Example: assume you have a variable named balance. Its value must be assigned to the variable credit if it is nonnegative, and to the variable debit if it is less than 0. You also want to print a message reflecting what has just been done. The printing format of credits may not be the same as for debits.

   A simple solution can be devised using the tools above:

baltype={id("credit"),id("debit")}
...
b_id=baltype[1+(balance<0)]
set_var(b_id,balance)
msg=sprintf("Your %s is " & var(b_id)@format,{var(b_id),get_var(b_id)})

   In Euphoria (2.4 and before), you'd have to explicitly write an if statement to perform this admittedly simple task.

   Also note that, since variable id's are global, they can be used to access shadowed symbols or static private variables.

5.4.3 Creating variables on the fly.

   It may be useful to create variables in other places than in variable declarations, specially inside routines. This can be done as follows:

id=new_var(type,name,_)
id=new_var(type,name,value)

   This is equivalent to saying in the proper place "type name" or "type name=value", and gives you the id for this variable. Note the use of the anonymous placeholder '_' when no initial value is provided.

   Also note that variables can be created conditionally using this mechanism. You cannot new_var a global or static variable. A variable declared in this way is private if it is inside a routine and public otherwise.

   Creating a symbol clashing with an existing one, or accessing an id that does not exist, are error conditions, as might be expected.

5.4.4 Deleting variables.

   As new_var is primarily intended to create temporary variables, you may remove them once their short life span is over. This can be done as

   del_var(id)

   For obvious reasons, there are limitations on the use of such a tool:

  • You cannot del_var a symbol not declared with new_var();
  • The deleted symbol must not be shadowed at the time it is deleted. This condition can be restated as: id=var(id)@id.
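
   The life cycle of variable ids described in 5.4 can be modelled with a small table. The Python sketch below is a toy model under assumptions, not the interpreter's data structure: the class name VarTable is hypothetical, types, scopes and shadowing are left out, and errors are reported as plain KeyError.

```python
import itertools

class VarTable:
    """Toy model: ids come from a counter and are never reused."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._vars = {}          # id -> [name, value]

    def new_var(self, name, value=None):
        # Creating a symbol that clashes with an existing one is an error.
        if any(n == name for n, _ in self._vars.values()):
            raise KeyError(f"symbol clash: {name}")
        var_id = next(self._next_id)
        self._vars[var_id] = [name, value]
        return var_id

    def isvarid(self, var_id):
        return var_id in self._vars

    def get_var(self, var_id):
        return self._vars[var_id][1]   # KeyError for a dead or unknown id

    def set_var(self, var_id, value):
        self._vars[var_id][1] = value

    def del_var(self, var_id):
        del self._vars[var_id]         # the id itself is never handed out again
```

   Because the counter only moves forward, a variable created after a deletion gets a fresh id even when it reuses the deleted name, so a stale id can never silently designate another variable.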

5.5 Using a variable.

5.5.1 Variables and values.

   If the general_identifier of a variable appears on the left side of an assignment symbol (see 5.5.2 below), its value will be subject to change:

  • the righthand side of the assignment symbol is evaluated;
  • the type function associated to the variable may be called, with the resulting value as an argument.
  • if the variable can be written to, and if the above call returned a nonzero atom, the value becomes the new value of the variable. Otherwise, an exception is raised.
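
   The three steps above can be sketched in Python as follows. This is a simplified model under assumptions: the names TypedVar and TypeCheckError are hypothetical, the type function is always called (the text only says it may be), and the read-only state is reduced to a single flag.

```python
class TypeCheckError(Exception):
    """Raised when an assignment is refused."""

class TypedVar:
    def __init__(self, type_fn, writable=True):
        self.type_fn = type_fn      # validation function of the variable
        self.writable = writable
        self.value = None
        self.assigned = False

    def assign(self, compute):
        value = compute()                # 1. evaluate the right-hand side
        ok = self.type_fn(value)         # 2. call the type function on it
        if not (self.writable and ok):   # 3. store the value, or refuse
            raise TypeCheckError(repr(value))
        self.value = value
        self.assigned = True
```

   A refused assignment leaves the previous value in place, since the new value is stored only after both conditions have been checked.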

   Otherwise, the value of the variable is substituted for its identifier at run-time.

   If a variable is passed by reference to a routine (see 8.3), the routine will modify it only if it can write to it.

5.5.2 Assignments.

   A variable may be assigned a value using its id and the set_var routine, or using an initialization on declaration; but these are by no means the most frequent way of doing it.

   There are three ways to assign a value to a variable using assignment operators:

general_identifier assignment expression
#({general_identifiers}) assignment expression
#({general_identifiers})# assignment expression

   In the first form, a variable gets (modified by) the value on the right side. The second form allows this to take place on several variables at the same time, so that they are assigned, or modified by, the same value to which the righthand side evaluates.

   The third form normally requires the righthand side of the assignment to be a nonatom. The first element of the list on the left side of the second # is assigned the first element on the right side, and so on until one or both sides run out of elements. It could be called "desequencing", as it sends the contents of a sequence to several variables. If the righthand side is an atom, it is treated as a sequence of length 1.

   To retrieve only some elements from the righthand side in intermediate position (an element of higher index is retrieved), use the "_" universal placeholder where a variable would be expected. This effectively discards the value that would be in the assignment otherwise, so that that value is not even computed.

   As an example, assume seQ is a sequence with a last name, a first name and possibly a phone number. You want to get the last name, and the phone number if available. For some reason, you don't want to test the length of seQ. You can do the following:

#(name,_,phone)#=seQ

   name always gets seQ[1]. seQ[2] is never inspected. If seQ[3] exists, it is assigned to phone, otherwise nothing happens.
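
   The third assignment form can be modelled in Python to make its rules explicit. The sketch is an approximation under assumptions: the helper desequence, the SKIP marker standing in for "_" and the dictionary env holding the variables are all hypothetical, and only Python lists and tuples are treated as sequences.

```python
SKIP = object()   # stands in for the "_" placeholder

def desequence(targets, source, env):
    """Assign source[k] to env[targets[k]] until either side runs out."""
    if not isinstance(source, (list, tuple)):
        source = [source]             # an atom acts as a length-1 sequence
    for name, value in zip(targets, source):
        if name is not SKIP:          # skipped positions are discarded
            env[name] = value
    return env
```

   With a two-element source, desequence(["name", SKIP, "phone"], ["Doe", "John"], env) sets env["name"] and leaves env["phone"] untouched, which mirrors the seQ example.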

5.6 Aliasing an element of a variable.

   You can specify an alternate name for an element of a nonatom. This is especially handy when complex index specifications are involved. The available tools are:

name aliased as alias
rename aliased as newalias
unname alias

name supplies an alternate name for an element in an array or sequence.
rename changes an existing alias to another one.
unname makes an alias unavailable.

   Aliases, in all this section, are identifiers, while aliased has the form identifier[index specification]. They act exactly as structure members do. As a result, an element keeps its name even if its position in the host sequence changes, as long as it exists. If a named element is removed from a sequence in any way, it is considered as unnamed. Any reference to a name that does not exist raises the UnknownToken exception.

6 Included files and namespaces.

   OpenEuphoria adopted the open philosophy of Euphoria in the sense that a lot of functionality is to be found in libraries rather than in the language itself. The main advantage is that anyone can customize or upgrade routines in the libraries easily - they are plain text source files -, rather than tinker with the OpenEuphoria interpreter/compiler source itself, which may be written in another language. The drawbacks are loss of performance, version conflicts and symbol clashes.

   The physical files in which the program looks for such material are called included files.

6.1 Namespaces.

   Because symbols from different files may share the same name and be visible from the same location in the program, there must be a way to unambiguously refer to any of them.

   Namespaces are the way. They are identifiers that prefix the symbol name. The prefix is separated from the raw symbol name by a colon ':'.

   Namespaces apply to global symbols only. By construction, there is only zero or one public symbol and zero or one private symbol to be seen from any given location in the program. However, global symbols do not necessarily harbor explicit namespaces. Global symbols are in the default namespace.

6.2 Including and naming a file: a first approach.

6.2.1 The include statement.

   Auxiliary files are made available to the main file using the include statement:

include filename|(expression)
include filename|(expr) as namespace

   Remember that a filename is either a string or a parenthesized expression whose run-time value is to be interpreted as a file name. The (generated) string is passed to the operating system as a filename as-is, and must conform to whatever syntax rules the OS enforces, like double quoting long file names with spaces in them.

   The simplest form makes global symbols in filename visible to the other files. This may lead to symbol clashes, some of which are caused by files the coder did not write. See sections 6.3 and 7 below.

   The second form allows using the prefix namespace: for global symbols in filename. Several filenames may share the same namespace.

   When a file is included for the first time, its directly executable statements are executed. Other subsequent include statements relative to this file do not trigger this action. A file is defined by its explicit path when supplied, or by a canonicalized path otherwise. As most OSes allow a path to be specified in many different ways, you may include the same file several times as if it were for the first time, duplicating created symbols, hopefully with different namespace prefixes.

However, this does not apply to links. These are aliases that are provided natively by some OSes, and as third-party add-ons by others. As the user generally sets these aliases voluntarily to make a file appear where it is not, links resolve to the true file name with the supplied path.

   include statements always declare namespaces: the default namespace is used even when none is supplied.

   The same file may appear with various namespaces in the same physical file. This is not really a feature, but legacy behaviour. On the brighter side, various files may include the same file with different namespaces.

   Using a string enclosed between parentheses causes the string to be considered as an expression, the evaluation of which is used as a filename.

6.2.2 Namespaces.

   Namespaces are a way for a given file to refer to symbols in another given file. As a result, a namespace is known only in the file where it appears after an as keyword. So, there are only two sorts of symbol clashes:

  • clashes between symbols sharing the same explicit namespace: the coder is responsible for them and must alter his/her own code to set things right;
  • clashes between symbols without namespaces may originate from files the coder did not write. And (s)he included them in order not to rewrite them. Tools are provided for the coder to manage such conflicts between external libraries; see below.

6.3 The import, promote and demote statements.

   Because of the somewhat undiscriminating nature of the include statement, which has symbols appearing in two namespaces when one was specified, and which acts in the same way upon all symbols in the included file, another construct is needed to get a more controllable behaviour. Changing the rules for include would most likely break too much Euphoria code.

6.3.1 The import statement.

   The statement

import filename|(expression) as namespace

makes the symbols of filename appear in the namespace namespace. The symbols are not visible in the default: namespace, contrary to what the include statement does.

   A string immediately following import and enclosed in parentheses is an expression that must evaluate to a string. That string is then processed as a filename, just like it would for an include statement.

   Thus, import (misc.oe) as msc will look for a record called misc, with a member named oe, or a sequence misc with a named element oe. If this can be found and holds a string, this string is the filename to be imported.

   But import misc.oe as msc will look for a file called misc.oe and will make its global symbols visible in the namespace msc only.

   The discussion about "the first time" a file is included applies to the import and include statements collectively.

6.3.2 The promote statement.

   Because it is sometimes convenient or useful to use global symbols without using prefixes, it is possible to select symbols to be promoted to the default (unprefixed) namespace.

   The supported syntaxes are:

promote "{identifier}" from namespace
grant unprefixed access to symbols explicitly specified in the list.
promote identifier from namespace
identifier is assumed to be a sequence of strings, each of them being the name of a promoted symbol as above.
promote _ from namespace
promotes all symbols from namespace.
promote but list from namespace
promote all symbols but the supplied exclusion list. The list may have any of the first two forms above.

   Promoted symbols can then be accessed as if the file they come from had been included using the longer form of the include statement.

6.3.3 The demote statement.

   Promoted symbols can be demoted, which means they still exist in their source namespace, but no longer in default. The following syntaxes are supported:

demote " {identifier}" [from namespace]
demote identifier [from namespace]
demote _ from namespace
demote but "{identifier}" from namespace
demote but identifier from namespace

   These forms drop unprefixed access for the symbols listed, in a way almost symmetric to how promote adds them.

   Why "almost"? Because the first two forms do not need to specify namespaces.

   Indeed, there is normally one symbol of each name in the default namespace, and there is no ambiguity in the command given, hence no systematic need for the extra argument.
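
   The effect of promote and demote on the default namespace can be modelled with dictionaries. This Python sketch rests on assumptions: the two helper functions and their parameters are hypothetical, a namespace is reduced to a name-to-value table, and passing names=None plays the role of the "_" placeholder.

```python
def promote(default, namespace, names=None, but=()):
    """Copy symbols into default; names=None means all symbols."""
    chosen = list(namespace) if names is None else names
    for n in chosen:
        if n not in but:              # honour the exclusion list
            default[n] = namespace[n]

def demote(default, names):
    """Drop unprefixed access; the source namespace is left untouched."""
    for n in names:
        default.pop(n, None)
```

   Note that demote needs no namespace argument in this model either: the name alone identifies the entry to remove from the default table.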

7 Scopes.

7.1 Named scopes.

   The construct

[global ]scope identifier
... some code ...
end scope

lets you pretend that the enclosed code comes from an external file, whose name does not matter. Any code that may appear at some position in a source file may appear in a scope block at the same position.

A global symbol defined inside a named scope can be used:

  • using the identifier: prefix, as if it came from another file;
  • without prefix; this requires the use statement (see 7.3 below).

   Whether a scope has a name or not, it basically remains a private area of code. To allow a named scope to be used, you explicitly enable it using the use name statement (see 7.3 below).

   A scope declared as global can be considered as an external file, and is assumed to have been included using the include statement, with the scope name as a namespace. Alternately, a global scope is a scope that can be used from other physical files than the one it is in.

   For this reason, namespaces are said to be associated to abstract files, and not only to physical files.

7.2 Unnamed scopes.

   Unnamed scopes may appear in routines and, in this case, follow the same rules as code in routines.

   This is specially handy when a small part of a routine needs a few variables not needed elsewhere in the routine. For clarity of code, putting them inside a scope block helps separate them from the ones most likely to be used.

   A global symbol defined in an unnamed scope is visible from outside - more precisely, below - the scope in any scope it lies in, as you cannot use an unnamed scope. Thus, a global symbol defined in an unscoped, unnamed scope is just a filewide public variable (scoping and global sort of cancel out), while a global symbol defined in an unnamed scope inside another scope can be seen from the part of the outer scope extending from the end of the inner scope to its own end. However, it can't be seen from outside the outer scope, unless this scope is named and used.

7.3 The use statement.

   The global symbols of a named scope can be accessed without any prefixing by issuing the use scopename statement.

   The use statement is always local, which means that its effects stop at the end of execution of code in the scope or routine it appears in.


8 Routines

   A routine is a piece of code which can be called and eventually returns. The first phrase means that control may be transferred to the first executable statement of the routine. This is not necessarily the first general-code statement, since variable initializations are executable statements. The second phrase means that, when the routine is finished with its work, it returns control to the statement following the calling statement.

   On return, a routine may provide a value, be it an atom or not. If the routine returns anything, the routine call is evaluated to the returned value.

   There are several keywords to define routines, because they have different history and role to play.

8.1 Defining a routine.

The definition of a routine involves:

  • optional attributes, which may be check, global or forward;
  • a routine type keyword, chosen in the following list:
    • routine
    • coroutine
    • procedure
    • function
    • type
    • reftype
    • handler
  • an identifier, which must not clash with a variable name;
  • a pair of parentheses enclosing a possibly empty list of formal parameters.

   Section 8.3 below describes formal parameters of routines.

8.1.1 Routine types.

   routine is the generic word designating a piece of code with its own variables that can be called and must return (unless it terminates some process).

   A coroutine has the same behaviour as a routine as far as return values are concerned. However, the statement that is reached on calling a coroutine is either the first one as usual, or the statement following the last yield taken in that piece of code, if it was not returned from the conventional way.

   A procedure is a routine which does not return any value.

   A function is a routine which must return a value.

   A type is a special sort of function. It must return a boolean, and takes exactly one argument. Specifying a type routine defines a user-defined type with the same name.

   A reftype is a special sort of function. It must return a boolean, and takes exactly two arguments: an integer, which is the id of the variable to be assigned, and a reference to the value to be assigned to it, so that the function can modify the value to be assigned. Specifying a reftype routine defines a user-defined type of the same name.

   A handler is a routine designed to handle events. These events are triggered at run-time, and handlers are not primarily meant to be called explicitly, even though they can as any routine. The argument list of a handler is described in 11.2 below.

   Additionally, when a variable is assigned, the (ref)type function associated to it is executed prior to the assignment, and an exception is raised if it returns False (see Chapter 1, "Types", for details). For this reason, (ref)types may be considered as a hybrid between functions and handlers.

8.1.2 Forward declaration.

   Sometimes, it makes the code clearer to use a routine even though it was not defined yet. In order to do this, you can use the forward attribute, followed by the routine definition.

   When the time has come to code the routine statements, just issue the statement routine_type routine_name, and go ahead with the statements in the routine. The full definition of the routine is already known, so that this shorter form is enough. You still can give the full definition again, but a SyntaxError error will occur if there is a mismatch between the two definitions.

   Obviously, if a routine is declared forward and no flesh is added to this bone, a SyntaxError exception will occur at the end of the parsing of the source file.

8.1.3 Calling a routine.

An explicit routine call is made of:

  • the routine name;
  • a pair of parentheses enclosing a list of values, called arguments of the call.

   The list of values should conform to the list of types specified in the routine definition. For instance, if foo was defined as

routine foo(integer i,string s)

, then a call to foo must be like foo(expr1,expr2). expr1 is checked to be an integer, and expr2 is checked to be a string. If one of the checks fails, or if there are not exactly two arguments, an exception will occur.

   If the routine is to return a value, and if this value is to be used, the routine call (routine name, parentheses and everything in between) is replaced by the returned value.

   You may ignore the value returned by a function or routine by desequencing it to the empty list, as in #(_)#=foo(i,s). Calling a function as a procedure causes a SyntaxError error, as well as calling a routine as a procedure when it returns a value.

8.2 The return and resume statements.

8.2.1 Returning from a routine.

   To signal that a routine must terminate and return control to the statement logically following the routine call, use the return statement. return by itself just terminates the routine; return expr does so and returns the value of expr.

The concept of "statement logically following the call" is as follows:

  • if no value is returned, the statement logically following the call is the statement physically following the call;
  • if a value is returned and the routine call is not part of a compound expression, the statement following the call is the assignment of the returned value;
  • if the call belongs to a compound expression, the statement logically following the call is the next step in evaluating this expression.

    If a routine does not have an explicit return statement, return is assumed right before the end mark of the routine.

8.2.2 The yield statement.

   You may return from a coroutine using another statement than return or extended_return: yield. Returning using the yield statement records the location of the statement. Next time the coroutine is called, the statement following the yield just taken will be executed as first statement of the coroutine.

The state of the coroutine, which means the set of values taken by its private variables, is saved as well and restored on the next call. This allows an easy implementation of threads.
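
   Python generators behave in a comparable way and can serve as a rough illustration of yield. The analogy is imperfect and rests on assumptions: the function counter is hypothetical, and in Python a generator object must first be created, then resumed with next(), whereas the text describes calling the coroutine directly.

```python
def counter(limit):
    total = 0                 # private state, preserved across yields
    for step in range(1, limit + 1):
        total += step
        yield total           # suspend; the next call resumes after this line
    return                    # conventional return ends the coroutine

co = counter(3)
first = next(co)    # runs from the top to the first yield
second = next(co)   # resumes after the yield, with total and step restored
third = next(co)
```

   Each resumption continues with the saved values of total and step, so the three calls produce running sums rather than restarting from zero.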

8.2.3 Resuming execution.

   The resume statement asks the routine to terminate and reexecute the statement which triggered the routine. For this reason, this statement is meant for exception handlers and, normally, should not be executed when the routine is called explicitly. There is no point in returning a value here, so that the mention of a value to return is not supported.

8.3 Formal parameters of a routine.

Formal parameters are characterised by three properties:

  • a passing mode
  • a type
  • a name

8.3.1 Passing mode.

   An argument is either a single variable name, a constant or a more complex expression like "x+1". Here, "variable name" extends to whatever might have a variable id, like a named sequence element or record member.

   An argument which is not a variable name can only be passed by value.

   However, when an argument is a variable name, two actions are possible. The first option is to proceed as above; the alternative is to let the variable be temporarily aliased by the associated formal parameter in the routine body.

   The first method is called "pass by value", and is the only method explicitly used by Euphoria. It is the default passing mode in OpenEuphoria.

   The second method is called "pass by reference", and must be explicitly enabled in the formal parameter specification.

   To allow passing by reference of a formal parameter, prefix it with the update keyword in the routine definition. When calling a routine, if the n-th argument is an expression other than a variable name and is supposed to be passed by reference, it will be passed by value instead. You can use the keyword byval just before the expression to emphasize that the effect of the update keyword is temporarily suppressed.

   The with byref directive controls the way OpenEuphoria checks passing mode. When calling a routine while with byref is in force (this is the default behaviour), if the n-th argument is a variable name and is supposed to be passed by reference, it must be preceded by one of the keywords byref or byval; otherwise an exception will occur. The variable will be passed by reference if byref is used, and passed by value if byval is used. If the directive is turned off, neither keyword is mandatory, as the routine definition holds all the relevant information.

   Remember that, when a variable is passed by reference, any modification made by the called routine to the formal parameter to which this variable is mapped by the routine call is reflected to this variable. When passed by value, no modification is reflected, since the routine operates only on a local copy of the variable.
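
   The difference between the two passing modes can be modelled outside OpenEuphoria. The sketch below is Python, and the Ref class and both functions are hypothetical illustrations: a mutable box stands in for a variable passed by reference, while a plain string stands in for one passed by value.

```python
class Ref:
    """Hypothetical stand-in for a variable passed by reference."""
    def __init__(self, value):
        self.value = value


def lower_into(s, result):
    """result models an update parameter: writes reach the caller's variable."""
    result.value = s.lower()


def lower_copy(s, result):
    """Pass-by-value model: the routine only changes its own local copy."""
    result = s.lower()
    return result


seq = Ref("heLlo, woRld!")
lower_into(seq.value, seq)      # comparable to passing seq with byref

text = "heLlo, woRld!"
lower_copy(text, text)          # comparable to passing seq with byval

print(seq.value)
print(text)
```

   After the first call the caller's variable has changed; after the second it has not, since only the local copy was modified.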

8.3.2 Parameter types.

   The type of a parameter must be explicitly stated; however, preprocessors may help reduce the typing by allowing type aliasing or completion.

   Specifying a formal parameter as array[(size)] or sequence indicates that no specific type is expected for the elements of the nonatom passed. You may give a size to an array or just leave it out.

8.3.3 Example.

without byref   --to simplify things
forward function foo(string s,update string result,integer i)
...
seq="heLlo, woRld!"
...
x=foo(seq,append(seq,x),3)

--the second parameter will be passed by value regardless of the update
--keyword, as nothing can be mapped to this compound expression.
x=foo(length(s),s,j)
--ERROR: length(s) is an integer, and the first parameter
--is a string.
x=foo(seq,seq,0) --this one is correct
...
routine foo --nothing more to say, here comes the beef.
...
result=lower(seq)
--after the correct call to foo, seq is "hello, world!",
--since result is the second parameter of foo, is passed by reference and
--seq is a variable name passed to foo as second argument, so that result
--aliases it.

8.4 Variables in a routine.

   Variables declared inside a routine are private and shadow any existing symbol with the same name. Formal parameters of routines have the same behaviour.

   A routine can access by name any public or global variable in scope at the time of the call, as well as its own private variables. A private variable cannot be accessed by name outside of its routine.

   On return from a routine, all its private variables cease to exist, except those declared as static.

8.5 Calling a routine.

8.5.1 The standard way.

   Explicit invocation of a routine takes the following form: the routine name, a left parenthesis, the possibly empty, comma separated list of arguments, and finally a right parenthesis.

   For instance, i=find("myself",someSequence) is a routine call with two arguments. The first argument is "myself" and the second one is someSequence. It is a function-like call, since a value is retrieved from the called routine on return.

   The arguments must match the formal parameters in number and type. Failure to do so will raise an "ArgError" exception.

8.5.2 A special use of desequencing.

   You can use a sequence to represent several consecutive arguments in a routine call. To do so, the sequence must be prefixed by the # desequencing sign, and be enclosed in parentheses to avoid any risk of being mistaken for a hex number.

   Thus, if you want to issue the call foo(1,2,3,0), and you have a nonatom fooArgs at hand with the value {1,2,3}, you can issue foo(#(fooArgs),0) with precisely the same effect, which is to call the foo routine with the four arguments 1, 2, 3 and 0.
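
   Several languages offer a comparable way to spread a sequence over consecutive arguments. As a hedged illustration only, here is the same call pattern in Python, where the * prefix plays a role similar to #( ); foo and foo_args are placeholder names.

```python
def foo(a, b, c, d):
    """Placeholder routine that simply reports the arguments it received."""
    return [a, b, c, d]


foo_args = [1, 2, 3]
direct = foo(1, 2, 3, 0)
spread = foo(*foo_args, 0)   # comparable to foo(#(fooArgs),0)
print(direct == spread)
```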

8.6 Dynamic invocation of routines.

Just like variables, routines have an index called routine_id. Five functions are provided to manage dynamic calls:

   Built-in routines, even though they are defined in no file, have routine id's which can be retrieved as for any user-defined routine.

8.7 Routines and namespaces.

   Global routines may be accessed by name when the abstract file they are defined in is included by other files. Any routine can be accessed through its routine id.

   OpenEuphoria provides quite a few built-in routines, like the length() function (counts the number of elements), the integer() type (returns True on integers and False otherwise) and so on. They are treated as global symbols, and also have a namespace of their own, called builtin.

   Defining a routine with the same name as an existing one in a given namespace (including default: or builtin:) generates a warning and shadows the preexisting routine with the newer one.

8.8 Routine metadata.

   The @ construct used for variables (see 5.1) is also available for routines, with a few restrictions however because some of them don't always make sense. The available metadata for routines are:

name
the name of the routine
type
a small integer representing the keyword used to define the routine. The recommended mapping is
  1. routine
  2. coroutine
  3. function
  4. procedure
  5. type
  6. reftype
  7. handler
id
the routine id of the visible routine with the given name.
format
meaningful for types and reftypes only. Default format to be used to print variables of this type. The default value for this metadata is "". If so, or if the routine is not a (ref)type, the value is ignored.
types
a possibly empty array of integers, which are the routine_ids of the types of the formal parameters, in the order that they were enumerated at definition time.
scope
sGlobal or sPublic, as for variables.

   Unless stated otherwise, the meanings of the same metadata for routines and variables are very similar. The get_meta function is also available for routines (see 5.1) exactly as for variables, except that it returns a record of the reserved type SystemRtMeta, with names and positions as in the list above.

8.9 Call chain management.

8.9.1 The call_chain function.

   Returns a sequence representing the calling chain that goes from the first routine call in the stack to the current statement; it does not include the call to call_chain() itself.

Each element in the returned sequence is either a string, which is a routine name or a file name, or a pair of strings. Each isolated string, or first string of a pair, represents the file scope or routine name the call took place in. The strings that appear second in a pair are labels attached to the calling statement, if any.
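
   For comparison, a reduced version of this idea can be sketched in Python with the standard inspect module. This is an assumption-laden model, not the OpenEuphoria function: it returns routine names only, from the outermost call down to the caller, and ignores file scopes and labels. The routines inner and outer are hypothetical.

```python
import inspect


def call_chain():
    """Return routine names from the outermost call down to the caller,
    leaving out the call to call_chain() itself."""
    frames = inspect.stack()[1:]              # drop call_chain's own frame
    return [frame.function for frame in reversed(frames)]


def inner():
    return call_chain()


def outer():
    return inner()


chain = outer()
print(chain[-2:])
```

   The last entries of the result are the routines closest to the point of the call; earlier entries depend on how the program itself was started.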

8.9.2 The extended_return statement.

   This statement allows returning from several layers of calls at once.

   The statement comes in two flavours: extended_return(levels) and extended_return(levels,value). If any value is returned, it is the second argument of the second form; the returned value is not inspected by any code in the routines that are being skipped in this way. For both forms, levels is the number of successive return statements to perform. Thus, extended_return() with a first argument of 1 is equivalent to the classical return, except that parentheses are mandatory.
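
   One way to picture the effect is an unwinding signal that each call level either absorbs or passes on. The Python sketch below is only a model under that assumption; the routine decorator, the _Unwind class and the three level functions are hypothetical and say nothing about how OpenEuphoria implements the statement.

```python
class _Unwind(Exception):
    """Carries the number of levels still to leave and the value to return."""
    def __init__(self, levels, value=None):
        super().__init__(levels)
        self.levels = levels
        self.value = value


def extended_return(levels, value=None):
    raise _Unwind(levels, value)


def routine(fn):
    """Marks fn as one call level that extended_return may skip."""
    def wrapper(*args):
        try:
            return fn(*args)
        except _Unwind as signal:
            if signal.levels <= 1:
                return signal.value       # last level: acts like a plain return
            raise _Unwind(signal.levels - 1, signal.value)
    return wrapper


trace = []


@routine
def level3():
    extended_return(2, "done")
    trace.append("level3 after")          # never reached


@routine
def level2():
    level3()
    trace.append("level2 after")          # skipped: level2 is left as well
    return "level2 value"


@routine
def level1():
    result = level2()
    trace.append("level1 resumed")
    return result


outcome = level1()
print(outcome, trace)
```

   With levels set to 2, the code remaining in level2 never runs and never inspects the value; level1 receives it as if level2 had returned it.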

9 Code blocks

   Code blocks are code between a blocktype statement and the matching end blocktype statement.

   Blocks are nested, which means that the order of the blocktypes and the order of the end blocktypes must be exactly the reverse of each other. Failure to do so causes an irrecoverable syntax error.

9.1 Labelling blocks.

   The label identifier statement may appear at any place in the code, tagging the following statement with the name identifier. However, the uses of this tag depend on the nature of the tagged statement. A label statement barely qualifies as a statement, as it is never executed.

Any label statement can be the target of a goto statement (see 9.8 below). Thus, the program execution point may be transferred to the statement following the label.

Any goto statement must have an accompanying label statement. This way, the target may be aware of whether it was reached by direct branching using goto or by a more normal kind of execution flow. identifier may then be retrieved by the come_from() function.

When tagging a code block header, like a for statement, the identifier can be used in instructions which control code block execution (see 9.4 and 9.5 below).

When tagging a statement containing a routine call, the identifier appears in the data returned by call_chain() when invoked at any point downstream in the call chain.

9.2 The if block

   Remember that this block may have a substructure:

if cond then

   general-code

[elsif cond then general-code]
[else general-code]
end if

   This will execute some general-code, chosen by the values of the cond expressions according to the following rules:

9.3 The select block.

   When several courses of action might be taken according to the value of some expression, you can always stack a few elsif statements inside an if block. However, it may not be the clearest way to code this sort of situation, and this is why an alternative construct is provided. Also, you may want to take several branches in succession, which the if statement does not allow.

   The structure of a select block is as follows:

select expr

case statement
[otherwise general code]

end select

and a case statement is as follows:

case expr
general code
case rel_op expr
general code
case expr thru expr
general code
case condition
general code

9.3.1 The selector.

   The expression following the keyword select is evaluated, and this value is called the selector of the block. Decisions will be made according to this value. Each branch of the decision tree is represented by a case statement; an otherwise branch may be there as well.

9.3.2 The case statement.

   The keyword case may be followed by four different types of items:

9.3.3 Instruction flow inside a select block.

   Each case statement starts with (a simplified form of) a conditional clause. The code following a case statement is executed whenever the corresponding condition is true. After that, if the block was not exited, the next case statement is inspected.

   The process goes on until one of the three mutually exclusive situations happens:

9.3.4 The otherwise statement.

   This statement is optional, and is allowed only inside a select block as its last sub-block. It may appear at most once in a block.

   If it is reached, and one of the branches of the select block was taken, the block is exited; otherwise, execution continues past the otherwise statement.

9.3.5 The stop statement.

   Allowed only inside a case branch, it causes that branch to be exited and the next case condition to be tested. If there is none left, the select block is exited.
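
   The flow described in 9.3.3 to 9.3.5 differs from a chain of elsif tests, since several branches may run in succession. A minimal Python model of that flow follows. It rests on assumptions: conditions and branch bodies are passed as functions, the BREAK sentinel stands in for the break statement, and the stop statement is not modelled.

```python
BREAK = object()   # hypothetical sentinel standing in for the break statement


def select(selector, cases, otherwise=None):
    """cases is a list of (condition, action) pairs, inspected in order."""
    taken = False
    for condition, action in cases:
        if condition(selector):
            taken = True
            if action(selector) is BREAK:
                return                 # the whole block is exited early
    if not taken and otherwise is not None:
        otherwise(selector)            # runs only when no branch was taken


log = []
select(
    7,
    [
        (lambda v: v == 7, lambda v: log.append("equals 7")),
        (lambda v: v > 5, lambda v: log.append("greater than 5")),
        (lambda v: 10 <= v <= 20, lambda v: log.append("10 thru 20")),
    ],
    otherwise=lambda v: log.append("otherwise"),
)
print(log)
```

   Both the first and the second branch run for the selector 7, and the otherwise branch is skipped because at least one branch was taken.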

9.4 Loops.

9.4.1 The for loop block.

   The complete syntax is as follows:

for index=start value to end value [by increment] do

general code

end for

   When the for statement is reached from outside the block, start value, end value and increment are computed. If the by clause is not present, increment is set to 1. They all must evaluate to atomic values. These values are not computed again during the subsequent loop iterations.

   The loop index variable index must not have been declared, and is assigned start value. It cannot be modified.

   If (start value-end value)*increment is greater than zero, no iteration is performed and the loop is exited, as there is no way for the index variable to get closer to end value. Otherwise, the first iteration starts.

   If the index is not between the start and end values, and if the for statement is reached from inside the loop, the loop is exited without any further iteration and execution resumes right after the end for statement; otherwise, a new iteration starts. On exit, the loop index variable remains available until the next for statement using the same identifier as its index variable. In Euphoria, by contrast, the loop index vanishes outside the for loop it was defined in.

   When the end for statement is reached, the index is incremented by the increment and control is transferred to the for statement.
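
   The iteration rules of this section can be checked with a small model. The Python function below is a sketch, not part of OpenEuphoria: for_indices is a hypothetical name, "between" is read as inclusive of both bounds, and a zero increment is rejected, which is an assumption of the sketch since the text does not cover that case.

```python
def for_indices(start, end, increment=1):
    """Return the index values visited by: for index=start to end by increment."""
    if increment == 0:
        raise ValueError("zero increment is not modelled in this sketch")
    if (start - end) * increment > 0:
        return []                      # the index can never get closer to end
    low, high = min(start, end), max(start, end)
    visited = []
    index = start
    while low <= index <= high:        # test made at the for statement
        visited.append(index)
        index += increment             # update made at the end for statement
    return visited


print(for_indices(1, 10, 3))
print(for_indices(10, 1, -4))
print(for_indices(1, 5, -1))
```

   The third call performs no iteration, because (1-5)*(-1) is greater than zero.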

9.4.2 The while loop.

   while cond do general-code end while

   Executes an iteration of the loop if cond is true. Otherwise transfers control right after the end while statement.

   The end while statement just causes the while statement to be executed again.

9.4.3 The wfor loop.

   The complete syntax is as follows:

wfor identifier=start value to end value [by increment] do

general code
end wfor

   This loop is a hybrid between a for and a while loop, hence the wfor name.

   If the by clause is not present, increment is set to 1. These values are computed whenever the wfor statement is executed, and always must evaluate to atomic values.

   The loop index variable identifier must have been declared. It is an ordinary variable which may be assigned inside the loop.

   If the index is not between the start and end values, the loop is exited without any (further) iteration. Execution resumes right after the end wfor statement. Otherwise, a new iteration starts.

   When the end wfor statement is reached, the index is incremented by the increment and control is transferred to the wfor statement.

9.5 Exiting blocks.

   Exiting a block means that the next executed statement is the one following the end blocktype statement which ends the block.

   A code block will be said to be "active" relative to this statement if it contains the statement.

9.5.1 Exiting keywords.

   The exit statement can be used to exit a loop, the exif statement can be used to exit an if block, and the break statement exits a select block.

   They all have an optional argument. If they don't, the current relevant block is exited. Otherwise, the specified block (see below) is exited.

9.5.2 Optional argument for exiting keywords.

   The optional argument of an exiting keyword is either a number or an identifier. The phrase "relevant block" translates to "loop block" when referring to an exit statement, an if block when referring to an exif statement and a select block when a break statement is involved.

   If the argument is an identifier, it must be a label tagging an active relevant block. Labels are set using the label statement described in 9.1 above. Failing this consistency criterion raises an exception. Otherwise, the block tagged by this label is exited.

   If the argument is an integer greater than zero, this number is the number of relevant blocks nesting the current one that must be exited. Thus exit 1 means "exit the active loop above the current one", exit 2 exits the loop above the one above the current one, and so on.

   If the number is negative, then the active relevant blocks above the current one are counted backwards from the top to determine the block to be exited. Thus, exit -1 means "exit the topmost active loop block", exit -2 means "exit the active loop just below the topmost one", and so on.

   An argument of 0 is ignored, as it would only emphasize that the current relevant block is to be exited.
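
   The three forms of the optional argument can be summarised by a small resolver. This Python sketch is illustrative only: resolve_exit_target and the list representation of active blocks are assumptions, labels are assumed to be unique, and an out-of-range number is treated as an error although the text does not say so.

```python
def resolve_exit_target(active_blocks, arg=0):
    """Return the position in active_blocks of the block to exit.

    active_blocks lists the active relevant blocks from the topmost
    (outermost) one to the current (innermost) one; each entry is the
    block's label, or None when the block is unlabelled.
    """
    current = len(active_blocks) - 1
    if isinstance(arg, str):                  # label form, e.g. exit outer
        if arg not in active_blocks:
            raise LookupError(f"no active block labelled {arg!r}")
        return active_blocks.index(arg)
    if arg >= 0:                              # 0: current block, n: n levels up
        target = current - arg
    else:                                     # -1: topmost, -2: just below it
        target = -arg - 1
    if not 0 <= target <= current:
        raise LookupError(f"no active block for argument {arg}")
    return target


loops = ["outer", None, "inner"]
print(resolve_exit_target(loops, 1), resolve_exit_target(loops, -1))
```

   With three nested loops, exit 2 and exit -1 designate the same block, the topmost one.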

9.6 Iteration control for loops.

   A loop iteration can be stopped at any point during its execution using the keywords next or retry. These keywords accept the same kinds of optional argument as exit.

9.6.1 The next statement.

   This statement causes a new iteration of the loop to occur. This means that control is transferred to the opening statement of the loop, causing index update in for or wfor loops, and condition evaluation in a while loop.

9.6.2 The retry statement.

   This statement causes the current iteration of the loop to start again. This means that control is transferred to the first statement inside the loop block. Thus, the index of a for or wfor loop is not updated, and the condition of a while loop is not evaluated.

9.7 Scope blocks.

   Described in chapter 7, they are also blocks and follow the general rule about nesting: a block can't end outside a block inside which it starts.

9.8 Navigating between blocks.

   On top of all constructs above, which allow for an orderly yet easily managed execution flow, OpenEuphoria provides another tool to perform tasks the above would not allow to perform easily: the goto statement.

Computer science experts have fought over the value of goto as a statement in a high level language. Using it too much certainly leads to a hard to follow program execution flow, which makes maintenance and upgrading all the harder. The goto statement is probably more useful in rapid development stages than in production code, even though sparse and relevant use can really optimize a few things.

9.8.1 The goto statement.

   The statement goto identifier causes program execution to resume at the statement that immediately follows a matching label statement in the current routine or file scope.

   A goto statement must be preceded by a label statement. This will allow the target to be aware that control was transferred to it using a goto statement. Failing to label a goto statement, or branching to an unavailable label, causes a runtime error.

9.8.2 The come_from function.

   This function takes no argument and returns the label attached to the last executed goto statement as a string. Most of the time, it is important for some statement to know it was reached by a goto rather than through a more conservative flow control command like if or while.

9.8.3 The come_back statement.

   Reverts the effect of a goto statement by transferring the execution point to the statement following the last goto taken.

9.8.4 The goto_clear function.

   This function clears the internal variable holding the label of the last goto taken, so that a subsequent come_back does not have unintended effects.

9.8.5 Deep sea navigation.

   For certain special purposes, it may be legitimate to perform a far jump to a label in another namespace. This is done by goto_far(namespace,label). Just like the less far-reaching goto, it must bear a label. The equivalent of the other two statements are come_from_far(), come_back_far and goto_clear_far(). Their description is identical to those in 9.8.2 to 9.8.4 above.

10 The built-in debugger.

   A run-time debugger makes it extremely easy to debug a program, much easier at least than scattering a few print() statements and having to guess what is going wrong in program flow, variable assignments and other issues.

   The integrated debugger is enabled by the with trace statement, and completely turned off by the without trace statement. This default behaviour saves execution time.

   If the debugger is enabled, you start it by the trace(1) statement, and turn it off from the running program by the trace(0) statement.

   A command/status line will also be provided, as the debugger may process user input (see 10.2 below) and display some information to the user.

10.1 Debugger screen.

10.1.1 General description of the debugger screen.

   The main debugger screen shows about 15 lines of code in 25 line console displays, more if console displays more lines, highlighting the one to be executed. This line will remain about the middle of the screen most of the time, so that some code before and after it can be seen always. It will be called the active line. Another line may be highlighted in some other way, and will be called the spot line.

   Another part of the screen is reserved to show the values of most recently accessed variables. These values are updated as source statements are executed.

   The debugger must be implemented in such a way that it will not trace itself, nor trace events it may (cause to) trigger.

10.1.2 Available keystrokes.

The following actions should be requested using one-key keyboard shortcuts:

10.1.3 Other commands.

   Scrolling through code using the mouse buttons, movements or wheel actions is to be provided.

10.2 Debugger commands.

   Rather than immediate actions, the following are commands aimed at inducing specific behaviour from the debugger, or to set some trace scheduling.

10.2.1 Dynamic breakpoint.

   The b command allows you to enter a conditional expression. This expression must be a valid OpenEuphoria condition. This condition sets up a dynamic breakpoint, which is triggered any time the condition is true. The expression may use any variable in scope at the time it is defined. Whenever one of these variables gets out of scope (for instance, returning from a routine), the dynamic breakpoint is disabled.

   This breakpoint is independent from the static breakpoint that F8 toggles on and off.

10.2.2 The ? command.

   The ? command allows you to enter a valid OpenEuphoria expression. This expression will be treated as a most recently modified variable and displayed as such.

10.2.3 The s command.

   The s command will prompt you to enter an OpenEuphoria expression. If this expression is not among the displayed variables, it is added as the ? command would add it. Moreover, a text box will open up and display a good deal of the expression value, considerably more than ? would have allowed. The S command (10.1) closes the box.

10.3 Status report.

   The status line referred to in 10.1.1 will display information about the state of the static and dynamic breakpoint, as well as the indication of which one was last triggered.


11 Event trapping and exception handling.

   Exceptions are situations which most likely arise from an error. Exceptions will cause default or user-defined routines (handlers, actually) to be executed when available. This way, the program knows that something possibly went wrong and may take corrective action as needed to avoid or soften the crash. An exception is generated by hardware, which signals that something is amiss (no memory at this address, invalid floating point number, stack overflow, whatever), and software may take action to recover, or stop processing in the most graceful way possible.

   Events are actions taken by the machine code being executed. Trapping them, also using handlers, lets the program be informed of what is going on. Such hooks are of obvious use for debugging or profiling purposes, but they may serve many more useful programming needs as well. Events are not really triggered; they are reports that some action, like calling a routine or reading a variable, is being taken. This signal may not be listened to, or be so in a limited number of cases.

   Because the same mechanism is used in both contexts, the term event will be used to refer to both exceptions and program events indifferently. The underlying physical architecture of the machine on which OE is running, or the design of its operating system, may change which exceptions or events are processed in software only or through hardware. As OE strives to be fully cross-platform, these details are supposed to be hidden from the user. In the event of exceptions to this principle, they will have to be fully documented in release documents.

A last note: code generated by most programming languages does trigger a lot of events. Most of the time, there is no way the software can hook the events and steer away from the default action, which may not be the most sensible or efficient in a given case. In line with the principle of openness, OpenEuphoria aims to provide total control, including in those situations that might go awry fast if the right move is not taken at the right moment.

11.1 Assigning a handler to an exception.

   This is done using the statement set_handler(id, event). id must resolve to the routine id of a handler. event must resolve to a string representing an event name.

   get_handler(expr), where expr resolves to an event name, returns the id of the handler for this event.

   A handler is called with five parameters:

11.2 Events.

   The following table lists the events and exceptions that call a handler, the parameter they pass as a last argument and the default handler action.

Name               Last parameter               Default action
AfterAssign        {}                           does nothing
AfterRead          value                        returns the value
AfterReturn        {value}, or {} if none       does nothing
AfterWarning       {warning text,warning code}  does nothing
ArgError           {argument #,value}           aborts
BeforeAssign       value                        calls type checking code, possibly issuing TypeError
BeforeCall         argument list                checks types, possibly issuing ArgError
BeforeExecute      statement text               does nothing
BeforeIndex        {}                           converts index to positive and checks if it is allowable, possibly issuing IndexBounds
BeforeRead         {}                           does nothing
BeforeWarning      {warning text,warning code}  displays the warning
ExternalOverflow   {max size,address}           aborts
IndexBounds        value                        aborts
MathIndeterminacy  argument                     aborts
RaisedError        error message                prints the line # and the supplied error message, then aborts
RuntimeError       {statement text,error code}  aborts
StackOverflow      stack size                   aborts
SyntaxError        statement text               aborts
TypeError          value                        aborts
UnknownToken       statement text               aborts
ZeroDivide         {}                           aborts

   Here is a more detailed account:

   By default, events are all disabled for maximal performance. The with events directive lets you enable or disable any event or event pair at will.

More usage notes:

11.3 The error procedure.

   This procedure takes a string as its argument, and passes it, as well as the usual four other arguments, to the RaisedError handler. Both second and third arguments are zero.

11.4 The resume_execute() and return_execute() primitives.

   It may be desirable, when a resume or return instruction is executed, to execute a dynamically generated statement after leaving the handler, but before the standard action is taken. The argument for resume_execute and return_execute is an expression which is fed to execute at the appropriate time.

   Example: assume that a string is being scanned, and its length may vary in the process. A while or wfor loop may do the trick, except that, after almost any statement modifying the scanning index or the length of the string, a check must be performed to avoid index out of bounds exception.

   A clean solution then is to instruct the relevant handler to quietly exit the loop whenever this condition happens.

   Thus, one may code:

IOBhandler=get_handler("IndexBounds")
handler IndexBounds(integer event,integer varid,integer index,integer lineno,update object vAlue)
if varid=scanned@id then return_execute("exit")

--when scanned is subscripted with an out-of-bounds index, just exit

else call_proc(IOBhandler,{event,varid,index,lineno,byref vAlue})

--otherwise, chain to previous handler.

end if
end handler

--now the loop
i=1
while i<=length(scanned) do
...
--code that no longer needs repeated checks like
--"if i>length(scanned) then exit end if"
...
end while
set_handler(IOBhandler,"IndexBounds")    --restore previous handler

   The code inside the loop got rid of repeated checks and is clearer and leaner as a result. There is hardly any performance loss, since the handler is invoked only on an error condition. As the index checks will be performed anyway, repeating them in code is sheer waste of CPU cycles, as they are mostly useless.
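
   The pattern used in this example (save the previous handler, install one that handles a specific case and chains to the saved one otherwise, then restore) is independent of the language. A reduced Python model is sketched here. All names are hypothetical, handlers receive a free-form list of details rather than the five parameters of an OpenEuphoria handler, and returning "exit" merely stands in for return_execute("exit").

```python
_handlers = {}     # event name -> handler (stands in for routine ids)


def _default_abort(event, *details):
    """Default action of the model: abort by raising an error."""
    raise RuntimeError(f"{event}: {details}")


def set_handler(handler, event):
    _handlers[event] = handler


def get_handler(event):
    return _handlers.get(event, _default_abort)


def raise_event(event, *details):
    return get_handler(event)(event, *details)


previous = get_handler("IndexBounds")       # save the handler in force
seen = []


def quiet_index_bounds(event, *details):
    if details and details[0] == "scanned":
        seen.append(details)
        return "exit"                       # handled quietly
    return previous(event, *details)        # otherwise chain to previous handler


set_handler(quiet_index_bounds, "IndexBounds")
result = raise_event("IndexBounds", "scanned", 12)
try:
    chained = raise_event("IndexBounds", "other", 3)
except RuntimeError as err:
    chained = str(err)
set_handler(previous, "IndexBounds")        # restore the previous handler
print(result, chained)
```

   Only the event concerning the scanned variable is absorbed; any other out-of-bounds report still reaches the previous handler and aborts as before.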

11.5 Error reporting.

   When an error causes an OpenEuphoria program to abort, it generates a file holding the values of all variables, as well as an error message stating the error, where it happened and, whenever possible, a traceback of all calls that led to the fatal error.

   Additionally, a message is sent to stderr. The default message is made of the header and traceback part in the error file.

   By default, the error file generated is called "oe.err". You can specify another relative or absolute file name using the call crash_file(newFileName).

   You can replace the default message with some of your own by calling crash_message(newMessage).

   You may also want to apply some processing to the crash message, whether it is the default one or not. You can do so using crash_process(id). id is the routine_id of a routine that takes a string (the current crash message) as an argument passed by reference and processes it before display (the default action is to return immediately).

12 Dynamic code execution.

   OpenEuphoria allows you to use strings to hold expressions or code to be executed.

12.1 The eval function.

   The eval function takes a string as argument. This string must be a valid expression: eval evaluates it and returns its value. Thus:

s1="3+"
s2="length(s)"
x=eval(s1 & s2)
will assign 3+length(s) to the variable x, unless s is not a declared nonatom, in which case an error occurs.
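
   Python has a built-in function of the same name with comparable behaviour, which makes a direct comparison possible. This is an analogy only: len replaces length, and the namespace dictionary passed to eval is a Python detail with no counterpart in the text above.

```python
s = [10, 20, 30]
s1 = "3+"
s2 = "len(s)"
x = eval(s1 + s2, {"s": s})      # evaluates the string "3+len(s)"

try:
    eval("3+len(undeclared)", {"s": s})
    failed = False
except NameError:                # the name is not declared: an error occurs
    failed = True

print(x, failed)
```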

12.2 The execute procedure.

   This procedure takes a string as argument. The string must evaluate to valid OpenEuphoria code. This code is then executed as if it were hardcoded at the position of the execute procedure call.

   This procedure may not be supported by all compiled or translated versions of OpenEuphoria, as providing support for this capability might prove particularly tricky or inefficient in these contexts.
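
   As a point of comparison, Python's exec built-in also runs code held in a string. The sketch below is an analogy under one notable difference: Python executes the code in an explicitly supplied namespace, whereas the text above describes execution as if the code were written at the position of the call. The variable names are illustrative.

```python
namespace = {"total": 0}
code = "for i in range(1, 4):\n    total += i\n"

exec(code, namespace)            # runs the statements held in the string
print(namespace["total"])
```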

13 External OOP support.

   Object oriented programming, or OOP, is not directly built in OpenEuphoria. External libraries will get notifications of OOP syntactic constructs and will have to implement these constructs.

   The OOP library is to be included in the reserved OO namespace. Normally, functions of the library are not called directly; the interpreter plugs in the appropriate OO calls, like a preprocessor would.

13.1 Recognised constructs and their translations.

Action                                OE syntax                         Translation
Starts a class definition             class Identifier                  constant Identifier=OO:begin_class()
Ends a class definition               end class                         OO:end_class()
Declares a private part of a class    private do                        OO:begin_private()
Ends a private part of a class        end private                       OO:end_private()
Declares a public part of a class     public do                         OO:begin_public()
Ends a public part of a class         end public                        OO:end_public()
Declares a protected part of a class  protected do                      OO:begin_protected()
Ends a protected part of a class      end protected                     OO:end_protected()
Applies a method to an object         Identifier1->Identifier2({expr})  OO:call_method(Identifier1,Identifier2,{{expr}})
Gets a member from a class            identifier1->identifier2          OO:get_member(identifier1,identifier2)
Sets a member of a class              identifier1->identifier2=expr     OO:set_member(identifier1,identifier2,expr)

14 Hello, outside World!

   Even though OpenEuphoria has very specific features and has a more abstract definition of data types than most other usual languages, it is able to interface with RAM structures, external files or devices, and compiled libraries from other languages.

14.1 External files and devices.

14.1.1 I/O channels.

   OpenEuphoria programs access files or devices using handles, or channel numbers. These integers are required by almost all communication functions.

   The three lowest possible values for channels are reserved: 0 is the standard input (usually, the keyboard), 1 is the standard output (normally, the console) and 2 is the standard error (normally, the console also). Redirection is handled by the host OS and not by the language.

14.1.2 The open function.

   Associating a channel to an external file or device is done through the open(name,mode) function. name is a file or device name string passed to the OS and assumed to be recognized and duly processed by it. mode is a string taken from the following list:

   In addition, you may add the "b" modifier to access files as binary rather than text ("rb" accesses a binary file for read only and so forth). Text files are organized in logical lines, separated by a format specific marker, and are supposed to hold values mapped to printable characters; binary files don't know about lines and may hold any kind of binary values.

   A returned value of -1 means the association was not possible for a variety of reasons (file not found, access denied, device busy, unsupported mode, ...).

14.1.3 Sending and receiving.

   Once an I/O channel is defined, you can read from and write to it. If it supports random access, you can position a channel pointer to select a place to read from or write to.

14.1.3.1 Reading from an I/O channel.

   The following functions read from a file or device, updating the channel pointer when applicable:

getc(channel)
read next character
gets(channel)
read next line
get_bytes(channel,number)
read number bytes
get(channel)
read the next text-converted OpenEuphoria object
get_screen_char(row,column)
reads the screen at line row, column column, and returns a pair {ASCII code,attribute}.
get_key()
returns key code of last pressed key
wait_key()
waits for a key press, returns key code
prompt_string(message)
prompts for a string, displaying message
prompt_number(message,bounds)
prompts for a number, displaying message and checking input with bounds

14.1.3.2 Writing to I/O channels.

   The following procedures write to a file or device, updating the channel pointer when applicable:

print(channel,something)
writes something to channel
printf(channel,format,something)
writes to channel the format string format where format specifiers are replaced by printouts of the values in something.
pretty_print(channel,something,options)
same as print, but with more options
put_screen_char(row,column,something)
writes something to screen starting at line row, column column. something must be a flat sequence of even length, as it is interpreted as alternating ASCII codes and attributes.
puts(channel,something)
writes one or more bytes to channel. If something is a nonatom, it must not itself contain any nonatom.
flush(channel)
flushes all file buffers related to channel. Normally, writes are buffered for performance.

   When a logical line is read from a channel, the OS specific line terminator is removed and replaced by a \n (ASCII 10) logical terminator. The reverse operation is performed on output. If no line is available from channel, -1 is returned.
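
   The line handling described above can be modelled in a few lines of Python. This is a sketch under assumptions: gets_model is a hypothetical name, the channel is represented by a binary stream, and only the CR LF terminator is shown as an example of an OS specific marker.

```python
import io


def gets_model(stream):
    """Return the next logical line as a list of byte codes, or -1 if none is left."""
    raw = stream.readline()
    if raw == b"":
        return -1                      # no line available from the channel
    if raw.endswith(b"\r\n"):
        raw = raw[:-2] + b"\n"         # OS terminator replaced by \n (ASCII 10)
    return list(raw)
```

   A last line without terminator is returned as is, and the following call reports -1.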

   ?something is a shorthand for pretty_print(1,something,SystemPPOptions).

14.1.3.4 Channel access management.

   You may suspend, resume or stop access to a channel:

lock_file(channel,mode,bounds)
restricts or prevents concurrent access to channel. mode is an integer that some OSes support to allow finer tuning of the lock. bounds allows locking only part of a channel; not all OSes support it. Returns 1 on success, 0 on failure.
unlock_file(channel)
ends the effects of a previous lock_file on channel.
close(channel)
close channel. Any subsequent attempt to access it will result in an error condition.

14.1.5 I/O channel pointers.

   Some channels have a pointer, which you can get or set, that controls where the next read or write will occur:

where(channel)
get the pointer for channel
seek(channel,position)
set the pointer for channel to position.

14.2 RAM management.

   OpenEuphoria has some types which have no equivalent in other languages, or may have to read data interpreted in different ways by other languages. Besides access to raw memory, OpenEuphoria recognizes main datatypes from other languages and can freely arrange them into raw structures, which are memory areas organized as standard structures.

14.2.1 Accessing memory.

   OpenEuphoria provides the following functions and procedures:

peek(address)
returns the byte at address
peek(parms)
returns a sequence of bytes. parms is a pair of values: the first one is the starting address and the second one the number of bytes to read.
peek(1|2|4|8|16)(s|u)(address|parms)
as above, but bytes are replaced by (signed) bytes, words, dwords, qwords or 128-bit chunks. The "s" modifier stands for "signed", and "u" for "unsigned". The 2, 4 or 8 modifier refers to words, dwords or qwords in an obvious way.
poke(address,values)
Writes length(values) bytes to memory starting at address. values is a sequence of integers; only the least significant byte of each is written. If values is a single integer, it is converted to {values} first.
poke(2|4|8|16)(address,values)
as above, but bytes are replaced by words, dwords, qwords or 128-bit chunks. The 2, 4, 8 or 16 modifier refers to words, dwords, qwords or 128-bit chunks respectively.
mem_copy(target,source,length)
Copies the memory block of length length starting at source onto the block of same length starting at target.
mem_set(start,value,length)
Sets the length bytes starting at start to the least significant byte of value.
mem_set4(start,value,length)
Sets the length dwords starting at start to the least significant dword of value.
allocate(size)
Sets aside a memory block size bytes long and returns its address, or 0 on failure.
allocate_string(string)
Creates a buffer holding remainder(string,256) & 0 and returns its address, or 0 on failure. The buffer size is length(string)+1.
free(address)
Frees a previously allocated block at address.
register_block(address,size)
Registers the block starting at address of length size to the memory monitoring routines in the safe.oe debugging library.
unregister_block(address)
Unregisters a block starting at address that was previously registered using register_block.
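
   The byte-level behaviour of some of these routines can be illustrated with a small Python model. It is only a sketch: RAM is a hypothetical 64-byte bytearray standing in for real memory, peek_n is a made-up stand-in for the peek2s/peek4u family, and little-endian byte order is an assumption, since the text above does not specify one.

```python
RAM = bytearray(64)  # hypothetical stand-in for raw memory


def poke(address, values):
    """Write one byte per element; only the least significant byte is kept."""
    if isinstance(values, int):
        values = [values]              # a single integer is treated as {values}
    for offset, value in enumerate(values):
        RAM[address + offset] = value & 0xFF


def peek(address, count=None):
    """Return one byte, or a list of count bytes starting at address."""
    if count is None:
        return RAM[address]
    return list(RAM[address:address + count])


def peek_n(size, signed, address):
    """Model of peek2s, peek4u, ...: read size bytes as one integer (little-endian assumed)."""
    return int.from_bytes(RAM[address:address + size], "little", signed=signed)


def allocate_string_bytes(codes):
    """Contents of the buffer allocate_string would create: remainder(string,256) & 0."""
    return [code % 256 for code in codes] + [0]
```

   The same two bytes {#FE,#FF} read back as -2 through the signed variant and as 65534 through the unsigned one.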

14.2.2 External types.

   Non OpenEuphoria data is known by its physical properties only. The following describes predefined and general external types. Although they are named "types", you cannot use them outside the context of structures in RAM without causing an error.

14.2.2.1 Predefined external types.

   (Open)Euphoria uses the following predefined types to qualify data sent to external routines. The table below also mentions the translations, when they exist, in terms of the general external types described in the next section.

Name                        Meaning                                                        Value
C_CHAR                      signed byte                                                    #0100_0001
C_UCHAR                     byte                                                           #0200_0001
C_SHORT                     signed word = signed byte2                                     #0100_0002
C_USHORT                    unsigned word = byte2                                          #0200_0002
C_INT, C_LONG               signed 4 byte integer = signed byte4                           #0100_0004
E_INTEGER                   signed 31-bit integer                                          #0600_0004
C_UINT, C_ULONG, C_POINTER  unsigned dword = byte4                                         #0200_0004
C_XLONG                     signed qword                                                   #0100_0008
C_UXLONG                    unsigned qword = byte8                                         #0200_0008
C_FLOAT                     single precision FP number = float32                           #0300_0004
C_DOUBLE                    double precision FP number = float64                           #0300_0008
E_ATOM                      Euphoria atom                                                  #0700_0004
C_STRING                    ASCIZ string = bytexx, last byte is 0 = delimited(0) byte      #0800_0001
E_SEQUENCE                  Euphoria sequence                                              #0800_0004
P_STRING                    first byte is the number of remaining bytes = counted1 byte    #0801_0001
P_XSTRING                   first 4 byte is the number of remaining bytes = counted4 byte  #0801_0004
E_OBJECT                    Euphoria object                                                #0900_0004

   The value in the third column is actually aliased by the name in the first column. OpenEuphoria supports all external types Euphoria supports, plus 64-bit integers and Pascal/Ada counted strings.

14.2.2.2 General external types.

   OpenEuphoria also allows a data type to be described by its physical properties alone. The general form of a general type descriptor is as follows:

[counted(number)|delimited(list) ][[un]signed ]byte(number) (number)
with the following meaning and destination:

   For instance, C_STRING is described by delimited(0) byte, while a P_XSTRING is a counted4 byte. An array of signed 64-bit integers of length 17 will be signed byte8 (17). If X_STRING is a C_STRING that may be terminated also by two -1 in a row, then it is described as delimited(0,{255,255}) byte.
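
   The physical layouts named in these examples can be made concrete with a short Python sketch. The function names are hypothetical, and storing the count in little-endian order is an assumption that the text does not state.

```python
def encode_delimited0(data):
    """Layout of delimited(0) byte (C_STRING): payload followed by a 0 byte."""
    if 0 in data:
        raise ValueError("payload must not contain the delimiter")
    return bytes(data) + b"\x00"


def encode_counted(data, count_size):
    """Layout of counted1 / counted4 byte: a count of count_size bytes, then the payload."""
    return len(data).to_bytes(count_size, "little") + bytes(data)


def decode_counted(buffer, count_size):
    """Read the count back and return exactly that many payload bytes."""
    count = int.from_bytes(buffer[:count_size], "little")
    return bytes(buffer[count_size:count_size + count])
```

   A counted1 string cannot hold more than 255 bytes; in this sketch, to_bytes raises OverflowError in that case.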

   You can use predefined and general external types interchangeably when both exist.

14.2.3 RAM structures.

   OpenEuphoria allows blocks of memory to be treated like ordinary structures, except that they have special types and can't hold ordinary types.

   Declaring an ordinary or external structure is done in a very similar way, except that structure is replaced by memory. There are other less visible differences too.

14.2.3.1 Use of context values.

   There are no context types as for structures, but lengths can be expressed using other fields of the structure; the dot syntax is recycled for this purpose.

   Thus, the following declarations

memory color(byte R,byte G,byte B) end memory
memory colors(
    C_LONG nbitems,
    color col_array(.nbitems)
) end memory
define a color RAM structure as an RGB triple of bytes, and a colors RAM structure that starts with a C_LONG (a signed 4-byte integer), followed by an array of that many color structures, called col_array.

   Any expression may be used, not only other fields of the structure.
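
   To show what the colors layout means at byte level, here is a Python sketch that decodes such a block. It is an illustration only: parse_colors is a hypothetical name, and the little-endian byte order and the absence of padding between fields are assumptions.

```python
import struct


def parse_colors(buffer):
    """Decode a colors block: a signed 4-byte count, then that many (R,G,B) byte triples."""
    (nbitems,) = struct.unpack_from("<i", buffer, 0)
    col_array = [tuple(buffer[4 + 3 * i:7 + 3 * i]) for i in range(nbitems)]
    return nbitems, col_array
```

   The length of col_array is not stored in the declaration; it is read from the nbitems field, which is what the .nbitems notation expresses.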

14.2.3.2 The optional attribute.

   Some structures come in various flavors, with the existence of some fields depending on the value of other fields, or on other factors, like an OS version. To address this situation, create a memory with the maximal number of fields, and declare some of them as optional, so that partial structures of it may qualify as it.

   For instance, the following declaration:

memory city_info(
    C_STRING name,
    C_STRING zipcode,
    optional C_LONG population,
    C_FLOAT area_acres,
    optional C_FLOAT average_growth
) end memory
actually defines three different structures that all qualify as a city_info. The shortest kind only has the name and zipcode fields; the intermediate kind has all fields but the last, and the longest kind has all fields in the description.

14.2.3.3 Bitwise rotate.

   As the physical size of a memory field is unambiguously known, it makes sense to apply a bitwise rotation to it.

   The call rotate(mem,field,n) will rotate the field field of the memory instance mem by n positions to the right. Use negative values of n to rotate to the left.
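
   The arithmetic behind such a rotation can be sketched in Python. rotate_right is a hypothetical helper that works on a plain integer and an explicit bit width rather than on a memory field.

```python
def rotate_right(value, width, n):
    """Rotate the width-bit pattern of value by n positions to the right.

    A negative n rotates to the left, as described for rotate().
    """
    n %= width                         # also maps negative n to the equivalent right rotation
    mask = (1 << width) - 1
    value &= mask
    return ((value >> n) | (value << (width - n))) & mask
```

   The width has to be known for the bits leaving on one side to re-enter on the other, which is why rotation only makes sense for fields of fixed physical size.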

14.2.3.4 Specific meaning of metadata.

   The deftype metadata is meaningless for structures, and is set to 0. For memorys, it is set to 0 by default, but can be changed. If a memory type has optional fields, you can specify the number of optional fields you don't want to be taken into account.

   For instance, in the city_info example, city_info@deftype may be set to 0, 1 or 2. The shortest form corresponds to the value 2, the intermediate form to 1 and the longest default form to 0.

   The id metadata doesn't make sense for memory instances, as they are not variables. For this reason, the id metadata of a memory instance holds its physical address instead. This allows reading from or writing to a memory instance using peek and poke, as well as the other routines described below.

14.3 Converting from and to OpenEuphoria types.

   As OpenEuphoria types don't have a fixed size, writing to and reading from memory must be done using specific routines.

atom_to_float(32|64)(float)
Converts a floating point number float to a sequence of 4 or 8 bytes, representing an IEEE 32- or 64-bit floating point number. The resulting sequence may be poked as is.
float(32|64)_to_atom(seq)
Takes a sequence of bytes of length 4 or 8 representing an IEEE 32- or 64-bit floating point number and converts it into an atom.
int_to_bytes(some_int)
Returns a sequence of bytes large enough to accommodate some_int. Its length is a multiple of 4 on 32-bit systems, and of 8 on 64-bit systems.
bytes_to_int(seq)
merges the bytes in seq into a single nonnegative integer, which is returned.

   Additionally, assigning to and from memory fields is done using the = sign. When some OpenEuphoria expression has a result that doesn't fit the field it is written to, the ExternalOverflow exception is raised. Its handler may take corrective action, including emulating infinite values etc.
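
   The conversions listed in this section can be modelled in Python with the struct module. This is a sketch under assumptions: byte order is taken to be little-endian, the word parameter of int_to_bytes is a hypothetical way to choose between the 32-bit and 64-bit behaviour, and only nonnegative integers are handled.

```python
import struct


def atom_to_float32(x):
    """IEEE single precision image of x as a list of 4 bytes."""
    return list(struct.pack("<f", x))


def atom_to_float64(x):
    """IEEE double precision image of x as a list of 8 bytes."""
    return list(struct.pack("<d", x))


def float64_to_atom(seq):
    """Inverse of atom_to_float64."""
    return struct.unpack("<d", bytes(seq))[0]


def int_to_bytes(some_int, word=4):
    """Bytes of a nonnegative integer, padded to a multiple of word bytes."""
    needed = max(1, (some_int.bit_length() + 7) // 8)
    padded = -(-needed // word) * word          # round up to a multiple of word
    return list(some_int.to_bytes(padded, "little"))


def bytes_to_int(seq):
    """Merge bytes into a single nonnegative integer."""
    return int.from_bytes(bytes(seq), "little")
```

   Round-tripping a value through atom_to_float64 and float64_to_atom returns it unchanged, whereas the 32-bit form may lose precision.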

14.4 Interfacing with external code.

   OpenEuphoria programs may access routines and data from external files or processes, and share their own routines and data with other processes.

call(address)
Calls machine code located at address. When it returns, using the retn (#C3) opcode, control is passed to the next program statement.
open_dll(libname)
Opens the library (dynamic link library or shared object) libname and returns an address (a machine integer). This address is 0 on failure.
define_c_func(entry_point,name|{'+',name},args,return_type)
Gets a routine_id for an external function. entry_point is the address returned on opening the source library, and name is the function name. args is a list of external types for the formal parameters, and return_type is the external type for the returned value. Use the + sign to request a __cdecl call; the default is __stdcall convention.
define_c_func({},address|{'+',address},args,return_type)
Gets a routine_id for an external function known by its address in memory. args is a list of external types for the formal parameters, and return_type is the external type for the returned value. Use the + sign to request a __cdecl call; the default is __stdcall convention.
define_c_proc(entry_point,name|{'+',name},args)
Gets a routine_id for an external procedure. entry_point is the address returned on opening the source library, and name is the procedure name. args is a list of external types for the formal parameters. Use the + sign to request a __cdecl call; the default is __stdcall convention.
define_c_proc({},address|{'+',address},args)
Gets a routine_id for an external procedure known by its address in memory. args is a list of external types for the formal parameters. Use the + sign to request a __cdecl call; the default is __stdcall convention.
define_c_var(entry_point,name)
Gets an address for an external variable. entry_point is the address returned on opening the source library, and name is the variable name.
c_func(id,args)
Acts as call_func, but used only on functions defined using define_c_func.
c_proc(id,args)
Acts as call_proc, but used only on procedures defined using define_c_proc.
call_back(id|{'+',id})
Returns an address for the routine in your code with routine_id id, so that external code can call it. It must have 9 arguments at most, and all arguments, as well as any returned value, must fit into machine size integers.
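
   The idea of obtaining a callable from a raw address plus a list of external types, and of exposing one's own routine through an address, can be illustrated with Python's ctypes module. This is an analogy, not OpenEuphoria code: add_ints is a hypothetical routine, and ctypes.CFUNCTYPE uses the cdecl convention, which corresponds to the '+' form above.

```python
import ctypes


def add_ints(a, b):
    """Hypothetical routine to be exposed to foreign code."""
    return a + b


# Prototype: return type first, then the parameter types (cdecl convention).
PROTO = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)

# Analogue of call_back(id): wrap the routine and obtain a machine address for it.
callback = PROTO(add_ints)             # must stay alive while the address is in use
address = ctypes.cast(callback, ctypes.c_void_p).value

# Analogue of define_c_func({},address,args,return_type) followed by c_func(id,args).
foreign_add = PROTO(address)
result = foreign_add(2, 3)
```

   As with call_back, both the arguments and the returned value here are machine size integers.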

15 Interaction with the OS.

15.1 File system calls.

   The following allows you to directly inspect or change the file system tree:

current_dir()
Returns the name of the current directory as a string.
chdir(path)
changes current directory to path and returns a success code
dir(dirspecs)
returns a sequence of structures of reserved type SystemDirEntry, each of which refers to one directory entry matching dirspecs. It holds name, attributes, size, date and time.
walk_dir(path,filter_id,recurse)
Scans path for directory entries and hands them in turn to the function with routine_id filter_id. Subdirectories of path are scanned recursively if the boolean recurse is True. Returns a nonzero exit code on failure, and 0 on success.
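
   The traversal performed by walk_dir can be sketched in Python. This is a model under assumptions: walk_dir_model and its visit callback are hypothetical names, entries are visited in name order, the callback receives the directory path and the entry name, and -1 is an arbitrary choice of nonzero failure code.

```python
import os


def walk_dir_model(path, visit, recurse):
    """Hand each entry of path to visit; descend into subdirectories if recurse is true.

    Returns 0 on success and a nonzero code when a directory cannot be scanned.
    """
    try:
        with os.scandir(path) as scanner:
            entries = sorted(scanner, key=lambda entry: entry.name)
    except OSError:
        return -1
    for entry in entries:
        visit(path, entry.name)
        if recurse and entry.is_dir(follow_symlinks=False):
            status = walk_dir_model(entry.path, visit, recurse)
            if status != 0:
                return status
    return 0
```

   In this sketch a subdirectory is first handed to the callback like any other entry, and only then scanned in turn.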

15.2 Graphics.

   OpenEuphoria slightly breaks compatibility with Euphoria here, as DOS-specific calls are not supported.

video_config()
Returns an array of integers: current color, mode, #rows, #columns, #horizontal pixels, #vertical pixels, #supported colors, #pages.
text_color(color)
Sets the color used for any subsequently displayed text.
bk_color(color)
Defines the color of the background for further displayed text.
clear_screen()
clears the screen to the current background color.
text_rows(nblines)
Returns the number of lines of text on screen, trying to set it to nblines.
scroll(amount,start,end)
Scrolls the lines from start to end inclusive by amount (which can be negative).
position(coord1,coord2)
Sets cursor position.
get_position()
Returns cursor position.
wrap(bool)
Lets long lines wrap (bool=True) or have them truncated when they reach the right edge of the screen.
display_text_image(position,image)
Displays a combination of characters and attributes, image, at position.
save_text_image(ULcorner,BRcorner)
Saves a rectangular zone of screen, specified by the position of its upper left and bottom right corners, as a sequence in the format expected by display_text_image.
read_bitmap(filename)
Reads the bitmap file filename into a pair consisting of a palette (the colors used) and a sequence of sequences of pixels, each pixel being coded by the index of its color in the palette.
save_bitmap(bmpdata,filename)
Creates the bitmap file filename and saves bmpdata to it so as to represent a valid bitmap. bmpdata has the format of the value returned by read_bitmap. Returns specific status codes.

15.3 Other screen-related calls.

   The following functions are also provided:

mouse_pointer(yesno)
Hides or shows the mouse pointer according to yesno being False or True.
mouse_events(mask)
Sets the list of events get_mouse will monitor. mask is an integer, sum of the powers of 2 representing desired mouse events.
get_mouse()
Returns a triple {event,x,y} representing the latest reported eligible mouse event and where it occurred, or -1 if none.
cursor(hexvalue)
Sets cursor shape.
message_box(message,title,buttons)
Displays a message box titled title, showing message and the other elements specified by buttons.
free_console()
Closes the displayed text mode window.

15.4 Process control.

   You have some control over, and information about, the current process:

allow_break(yesno)
Allows program termination using control-C or control-Break if yesno is True, and disallows it otherwise.
check_break()
Returns the number of times control-C or control-Break was pressed since last call, and resets the counter.
sleep(seconds)
Suspends current thread for seconds seconds. seconds need not be an integer. The value is rounded up to the maximal resolution the OS appears to support.
abort(exitcode)
Terminates program and returns exitcode to the parent program or OS.
instance()
Returns the program handle under Windows, and 0 under other OSes.

15.5 Other generic OS services.

   OpenEuphoria also provides a variety of system calls, including generic execution of shell commands:

sound(frequency)
Turns the PC speaker on at the given frequency if nonzero, or turns it off. A beep is heard under DOS only.
time()
Returns the number of clock ticks elapsed since some fixed point.
tick_rate(number)
Sets the number of clock ticks per second (under DOS only).
date()
Returns a sequence of integers representing the current system date.
command_line()
Returns the executable name, main program file name and extra words on the command line.
getenv(varname)
Returns the value of the environment variable varname as a string if it exists, or -1 otherwise.
platform()
Returns a small integer identifying the host OS.
system(command,mode)
Passes command to a new copy of the command interpreter (if there's any). On return, graphic mode handling is controlled by mode.
system_exec(invocation,mode)
Instructs the OS to run invocation as an external program and return the exit code of that program. Graphic handling on return is specified by mode.

15.6 DOS specific primitives.

   Even though OpenEuphoria almost drops support for DOS programs, it cannot do so completely and provides a limited set of commands to address specifics of this OS:

allocate_low(size)
Works as allocate above, except that the returned address is in conventional DOS memory, below 1Mb.
free_low(address)
Frees the block at address that was allocated using allocate_low.
get_vector(number)
Returns a {segment,offset} pair representing the address of the DOS handler for interrupt number.
set_vector(number,address)
Sets the DOS handler for interrupt number to the address that address represents. address may be a value returned by get_vector.
dos_interrupt(16|32)(number,registers)
Calls DOS16/32 interrupt number with the CPU state in registers; returns a sequence of integers with the same meaning as registers, holding the new CPU state.

16 Mathematical functions.

   Mathematical functions are described in more detail in part C. Here is a list of what OpenEuphoria provides:
floor(x) the greatest integer not greater than x.
ceiling(x) the smallest integer not less than x.
remainder(x,y) x-y*floor(x/y)
abs(x) computes the absolute value of an integer or floating point number.
sqrt(x) calculates the square root of an object
rand(x) generates random numbers
set_rand(x) initializes the random number generator rand uses.
sin(x) calculates the sine of an angle
arcsin(x) calculates the angle with a given sine
cos(x) calculates the cosine of an angle
arccos(x) calculates the angle with a given cosine
tan(x) calculates the tangent of an angle
arctan(x) calculates the arc tangent of a number
log(x) calculates the natural logarithm
exp(x) calculates the exponential of a number
power(base,exponent) calculates base raised to the power exponent.
E the base of the natural logarithm (2.7182818...), or exp(1).
PI the circle perimeter/diameter ratio (3.14159...)
scale2(x) returns the exponent of the highest power of 2 not greater than the absolute value of the argument.
scale10(x) returns the exponent of the highest power of 10 not greater than the absolute value of the argument.
int_to_bits(some_int,size) Returns a sequence of 0's and 1's representing the size least significant bits of some_int. The first element of this sequence is remainder(some_int,2) when some_int is nonnegative.
bits_to_int(some_seq) Returns an integer made of the 0's and 1's of some_seq. some_seq[1] is the least significant bit of the returned integer, and so forth.
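
   A few of the definitions above are precise enough to be checked with a short Python model. This is a sketch only: the Python functions reuse the OpenEuphoria names for readability, scale2 is restricted to nonzero arguments that fit a floating point number, and the treatment of negative integers in int_to_bits (two's complement bits) is an assumption.

```python
import math


def remainder(x, y):
    """x - y*floor(x/y), as given in the list above."""
    return x - y * math.floor(x / y)


def int_to_bits(some_int, size):
    """The size least significant bits, least significant bit first."""
    return [(some_int >> i) & 1 for i in range(size)]


def bits_to_int(some_seq):
    """Inverse of int_to_bits: the first element is the least significant bit."""
    return sum(bit << i for i, bit in enumerate(some_seq))


def scale2(x):
    """Exponent of the highest power of 2 not greater than abs(x)."""
    if x == 0:
        raise ValueError("scale2 is undefined for 0")
    # frexp gives abs(x) = m * 2**e with 0.5 <= m < 1, hence the result e - 1.
    return math.frexp(abs(x))[1] - 1
```

   With this definition of remainder, the result takes the sign of y: remainder(-7,3) is 2, not -1.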

