Throughout this documentation, written text may have several meanings which must be carefully separated. In order to do so, some graphical conventions will be used:
typewriter-like
fashion.
The vertical bar ( | ) will denote a choice to be made among a finite number of options, like in for|if|while.
The type of an atomic entity is defined by a basic type and, optionally, by a validation function. Nonatomic types are defined by the way other types are grouped together.
Actually, OpenEuphoria uses two distinct, not always separated, notions of a type. A value has a type when the function that recognizes that type, and bears the same name, returns True for this value. A variable has a type when it can hold values with that type.
Thus, the phrase "to pass a type check" is much more precise than "to have a type". Please consider that the latter is just a condensed way of writing the former.
The basic built-in atomic types do not need explicit user definition. They are:
Note that no reference to actual size is being made. The programmer must know that 32-, 64- or 128-bit integers can be used, but the source need not. If it does, it actually refers to some bit stream representation of an integer. Such objects are dealt with using raw memory structures.
Nonatomic types group together other types in various ways. The supported layouts are:
The type string is a shorthand for sequence of char. The type fixedstring(number) is a shorthand for array(number) of char.
In a somewhat related way, arrays are just special kinds of sequences, so the only assignment problem is assigning a sequence to an array when lengths don't match. This will be handled by the fit() procedure (see 3.7.3.3 below).
Please also note that the object type can accommodate any nonatomic as well as atomic data.
A user-defined type is a refinement of (a refinement of (...)) one of the above types, defined through a validation function. Such a function is defined using the keyword type or reftype as a routine type. It may have side effects, and must return an atomic value.
A variable x has the user-defined type mytype if:
- a routine named mytype can be called and is declared using the keyword type or reftype, and mytype(x) is a non-zero atom; or
- a record of the type mytype has been declared, x has exactly as many fields as listed in the declaration of mytype, and each field of x has the expected type.
Please note that variables carry user-defined type information, but values don't. This subtle difference will surface in 5.3.2 (assigning elements to nonatoms and tracking their types).
The type of a member of a record may be specified using the context information of the other members of the record.
To do this, the member type must be declared as "_" in the record declaration. Later, the type declaration has the following form:
[global ]type|reftype record_name.member_type_name(parameters)
...
end type|reftype
parameters consists of one or two parameters (see 8.1 below).
To refer to members of the structure, use the .membername syntax.
Example:
record stringWithIndex(
string s,_ index)
--type of the index member will be defined later as stringWithIndex.index
--index must be a positive integer not greater than the length of the s member of the same structure.
end record
...some possibly unrelated code ...
type stringWithIndex.index(integer i)
return i>0 and i<=length(.s)
end type
A reftype might have been used as well.
1.5 Type checking.
1.5.1 The general case.
When a variable is going to be assigned a value, it may be checked that the variable can hold such a value. To determine this, the type or reftype function attached to the variable is called, and the check succeeds if it returns True. Any nonzero atom is interpreted as True. A failed type check causes an exception. The handler for this exception may take corrective action or just let the running program abort.
When the with type_check directive is in force, this process is performed on every assignment. As this impacts performance, systematic type checking may be turned off. For obvious reasons, type checking will still take place at a few implementation specific places.
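The assignment-time check described above can be modelled outside the language. The following is a minimal Python sketch, not OpenEuphoria code. The names TypeCheckError, CheckedVariable and positive_integer are hypothetical, and the sketch assumes only what the text states: the validation function is called on assignment, any nonzero result counts as True, and a failed check raises an exception.

```python
class TypeCheckError(Exception):
    """Hypothetical stand-in for the exception raised by a failed type check."""


class CheckedVariable:
    """Models a variable whose type is given by a validation function."""

    def __init__(self, type_func, type_check=True):
        self.type_func = type_func    # plays the role of the type/reftype routine
        self.type_check = type_check  # models "with type_check" being in force
        self.assigned = False
        self.value = None

    def assign(self, value):
        # Any nonzero result is interpreted as True, as in the text.
        if self.type_check and self.type_func(value) == 0:
            raise TypeCheckError(f"{value!r} failed the type check")
        self.value = value
        self.assigned = True


def positive_integer(x):
    """Example validation function: 1 for positive integers, 0 otherwise."""
    return 1 if isinstance(x, int) and x > 0 else 0


v = CheckedVariable(positive_integer)
v.assign(5)             # passes the check
try:
    v.assign(-3)        # fails the check; v keeps its previous value
except TypeCheckError:
    pass
```

Constructing the variable with type_check=False skips the call to the validation function, which mirrors turning systematic type checking off.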
1.5.2 Forced type checking.
When systematic type checking is turned off, you may wish to keep some control over which type checks are performed, because you need the checking - for instance, when the type functions have side effects.
To this end, you can add the keyword check as a prefix to a (ref)type. Types thus earmarked are always checked, even when type checking is off.
You can further restrict checking to some variables by creating a pair of twin types, one without the check prefix and one wrapping the latter, but with the check prefix.
You can further fine tune the checking by having user defined types that return True (check passed) unless some condition is met, in which case some real action takes place instead.
1.6 Type aliasing.
You can give alternate names to types. This may enhance code readability, as the same type may have several interpretations in the same program. But it is mostly needed to call type checking functions for types with a compound name, like sequence of integer. You can do this using the statement:
type | reftype altname is aliased
This creates an alias and a function. The alias altname can be used wherever aliased could be used. The altname() function checks membership in the type aliased; it is a type or a reftype according to the statement used.
Note that you can use compound types as typechecking function identifiers. So, the call
(sequence of array(4) of integer)(x)
is valid and checks whether x is a sequence of array(4) of integer. This scheme does not extend to other functions: their identifiers are only made of contiguous allowable characters.
2 Basic tokens.
2.1 Identifiers.
An identifier is a string of consecutive letters, digits and underscores, starting with a letter. What a letter precisely means depends on implementation, but always includes the ranges 'a'-'z' and 'A'-'Z'.
Lower and upper case letters are different; contrary to some other languages, OpenEuphoria is case sensitive, which means that the exact spelling of an identifier is taken into account.
So, "var" and "VaR" are different, valid identifiers, while "_top" or "2read" are not valid. "Ýz2" may or may not be valid, according to implementation specific rules, and specifically whether they treat Ý as a letter or not. Euphoria only admits the minimal specification above as defining a letter.
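The minimal identifier rule can be written as a regular expression. The following is a small Python sketch under the assumption that only the guaranteed ranges 'a'-'z' and 'A'-'Z' count as letters; the function name is_identifier is hypothetical, and an implementation that accepts more letters would need a wider character class.

```python
import re

# Minimal letter set guaranteed by the text: 'a'-'z' and 'A'-'Z'.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_identifier(text):
    """Return True if text is an identifier under the minimal rule."""
    return IDENTIFIER.fullmatch(text) is not None


for candidate in ("var", "VaR", "_top", "2read"):
    print(candidate, is_identifier(candidate))
```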
2.2 Quoted characters.
A quoted character is a single (double-byte) character inside single quotes, or an escape sequence, also inside single quotes. Supported escape sequences are:
\' single quote
\" double quote
\n newline (ASCII 10) (you cannot use \N)
\r carriage return (ASCII 13)
\t tab (ASCII 9)
\\ backslash
\(number) the character with ASCII/Unicode code number.
Thus, 'f', '\n' or '\(3245)' are all characters.
2.3 Text.
Text is anything between double quotes. It is not processed at all, except for escape sequence resolution.
Verbatim text is enclosed between matching groups of three consecutive double quotes ("""). In verbatim mode, any character, including whitespace, is considered as part of the string. There is no special meaning for the backslash character, and there is no escape sequence processing as a result.
There is also an intermediate long text mode. Strings in this mode are enclosed between matching $" and "$. line_end characters '\r' and '\n' are ignored in long text mode, but other characters, including escape sequences, are treated as in normal text mode.
Example 1:
"This is a very long string which "&
"has to be broken for readability reasons."
could be written as:
$"This is a very long string which
has to be broken for readability reasons."$
Example 2:
"This is a very long string which \n"& "has to be broken for readability reasons."
could be written as:
"""This is a very long string which
has to be broken for readability reasons."""
2.4 Numerical items.
They fall into four categories:
- sequences of decimal digits, optionally preceded by a plus or minus sign, and possibly containing a decimal point in any position (but not before the sign)
- sequences of hex digits, preceded by #.
- fractions, which take the form number/number. Both numbers must be integers or fractions themselves;
- scientific notation for numbers as follows: [sign]decdigit[.decdigits]e|Enumber.
The number must be in decimal digits.
Internally, OpenEuphoria performs as many automatic conversions as it needs, taking advantage of available hardware, to minimize memory usage by numerical items, while retaining the precision of these numbers.
2.5 The infinity.
Infinity is a mathematical concept alien to computers, as computers execute a finite number of machine instructions on finite size data; otherwise they are said to hang and must be switched off.
However, the infinity concept is embodied in some floating point numbers, and has formal properties that can be implemented on computers.
The [+]inf and -inf symbols are atoms with the following unusual properties:
Left operand        Operator   Right operand         Result
any atom            +          +inf                = +inf
any atom            +          -inf                = -inf
+inf                +          any atom            = +inf
-inf                +          any atom            = -inf
+inf                +          +inf                = +inf
-inf                +          -inf                = -inf
+inf                +          -inf                = <error>
-inf                +          +inf                = <error>
any atom            -          +inf                = -inf
any atom            -          -inf                = +inf
+inf                -          any atom            = +inf
-inf                -          any atom            = -inf
+inf                -          +inf                = <error>
+inf                -          -inf                = +inf
-inf                -          +inf                = -inf
-inf                -          -inf                = <error>
any positive atom   *          +inf                = +inf
any positive atom   *          -inf                = -inf
any negative atom   *          +inf                = -inf
any negative atom   *          -inf                = +inf
+inf                *          any positive atom   = +inf
-inf                *          any positive atom   = -inf
+inf                *          any negative atom   = -inf
-inf                *          any negative atom   = +inf
[+|-]inf            *          0                   = <error>
0                   *          [+|-]inf            = <error>
+inf                *          +inf                = +inf
+inf                *          -inf                = -inf
-inf                *          +inf                = -inf
-inf                *          -inf                = +inf
+inf                /          any positive atom   = +inf
+inf                /          any negative atom   = -inf
-inf                /          any positive atom   = -inf
-inf                /          any negative atom   = +inf
[+|-]inf            /          0                   = <error>
[+|-]inf            /          [+|-]inf            = <error>
any atom            /          [+|-]inf            = 0
The <error> in the rightmost column is the ZeroDivide exception when dividing by 0, and MathIndeterminacy otherwise.
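The rules above are close to IEEE 754 floating point arithmetic, except that indeterminate cases raise an exception instead of producing a NaN. The following Python sketch illustrates this under that assumption. The function name checked and the two exception classes are hypothetical Python stand-ins named after the exceptions in the text, not part of OpenEuphoria.

```python
import math


class MathIndeterminacy(ArithmeticError):
    """Hypothetical Python stand-in for the MathIndeterminacy exception."""


class ZeroDivide(ArithmeticError):
    """Hypothetical Python stand-in for the ZeroDivide exception."""


def checked(op, a, b):
    """Apply op ('+', '-', '*' or '/') to a and b, raising on the error rows."""
    if op == "/":
        if b == 0:
            raise ZeroDivide(f"{a} / 0")
        result = a / b
    elif op == "+":
        result = a + b
    elif op == "-":
        result = a - b
    elif op == "*":
        result = a * b
    else:
        raise ValueError(f"unsupported operator {op!r}")
    # For the operands listed in the table, Python floats yield NaN
    # precisely in the indeterminate cases.
    if math.isnan(result):
        raise MathIndeterminacy(f"{a} {op} {b}")
    return result


print(checked("+", 5, math.inf))    # inf
print(checked("/", 5, -math.inf))   # -0.0, which compares equal to 0
```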
Other mathematical functions may act on the infinity symbols, with sometimes useful results:
Function          Symbol     Result
abs               [+|-]inf   +inf
arcsin, arccos    [+|-]inf   ArgError
cos, sin, tan     [+|-]inf   MathIndeterminacy
scale2, scale10   [+|-]inf   +inf
log, sqrt         +inf       +inf
log, sqrt         -inf       ArgError
exp               +inf       +inf
exp               -inf       0
arctan            +inf       PI/2
arctan            -inf       -PI/2
The power function raises specific issues:
Base                      Exponent                  Result
0 < atom < 1              +inf                      0
1                         [+|-]inf                  1
1 < atom, +inf            +inf                      +inf
0                         [+|-]inf                  MathIndeterminacy
0 < atom < 1              -inf                      +inf
1 < atom, +inf            -inf                      0
any negative atom, -inf   [+|-]inf                  MathIndeterminacy
+inf                      any positive atom, +inf   +inf
+inf                      any negative atom, -inf   0
[+|-]inf                  0                         MathIndeterminacy
-inf                      positive even integer     +inf
-inf                      positive odd integer      -inf
-inf                      negative integer          0
-inf                      any non integer           MathIndeterminacy
2.6 Comments.
Comments may appear at the end of any physical line of a source file. A comment may also start a line that is otherwise empty.
A comment starts by the characters -- and extends to the physical end of line (the next line_end).
OpenEuphoria does not process comments in any way, and comments don't affect the code generated from the source file they appear in. The facility is provided in order to document your code so that others, or possibly yourself, find understanding the code a relatively easy task, so that it can be maintained or upgraded fairly easily. Time spent commenting code will often bring a large reward in terms of cuts in maintenance and debugging time, if nothing else.
Precise, concise, relevant, useful commenting is an obscure art that may make the difference between ordinary and outstanding coders.
3 Operations.
They are defined by the use of infix or prefix operators, as opposed to routine calls, which use prefix identifiers acting on a list of arguments enclosed between parentheses.
Remember that an infix notation is one that goes in between its operands (like the usual multiplication), while a prefix notation appears before its operands.
3.1 Supported operators.
3.1.1 Supported operator list.
They are:
+ addition of numbers
- subtraction of numbers
* multiplication of numbers
/ division of numbers
& concatenation of sequences
&& bitwise and
|| bitwise or
^ binary negation
~~ bitwise xor
<< binary left shift
>> binary right shift
>>> binary signed right shift
3.1.2 Precedence hierarchy.
When two or more operators appear in a row, without parentheses separating them, there is a choice to be made: which operation to perform first? This is an important question, since the results may differ.
There is a predefined set of rules to help the OpenEuphoria interpreter make a reasonable guess. The rules may be overridden using parentheses to force another evaluation order.
Here is the chart of operator precedence:
highest precedence: routine calls
                    unary +/-
                    bit-level operations
                    * /
lowest precedence:  &
Routine calls are evaluated first, then parenthesized expressions, starting at their deepest nesting level. The order of evaluation of items of the same precedence is undefined.
Thus, for instance, 3+2*4 is the same as 3+(2*4). To perform the addition first, code (3+2)*4.
Also, if the function f sets x to 3 whatever its argument, x+f(x) is 3+3=6 regardless of what x is.
3.2 Extension to nonatomic types.
If exactly one operand of any of the above operators is not atomic while the other operand is, the operation will be performed on each of the elements of the former operand. So, adding 1 to an array means adding 1 to each array element.
If both operands are nonatomic and have the same length, the operation is performed on each pair of matching elements in turn. So, {3,5}+{2,-4}={3+2,5-4}. An exception will occur on length mismatch.
Contrary to Euphoria, this scheme does not automatically extend to logical operators. The with seq_compat or with RDS directive turns the legacy behaviour on and off at will.
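The extension rule for arithmetic operators can be sketched in Python, with lists standing for nonatoms. The names is_atom and extend are hypothetical. Recursing into nested lists is an assumption borrowed from Euphoria's behaviour, since the text only describes one level.

```python
import operator


def is_atom(x):
    """In this sketch, anything that is not a list counts as an atom."""
    return not isinstance(x, list)


def extend(op, a, b):
    """Apply the binary function op to a and b, extending it to lists."""
    if is_atom(a) and is_atom(b):
        return op(a, b)
    if is_atom(a):
        return [extend(op, a, y) for y in b]
    if is_atom(b):
        return [extend(op, x, b) for x in a]
    if len(a) != len(b):
        raise ValueError("length mismatch between nonatomic operands")
    return [extend(op, x, y) for x, y in zip(a, b)]


print(extend(operator.add, [3, 5], [2, -4]))  # [5, 1]
print(extend(operator.add, [1, 2, 3], 1))     # [2, 3, 4]
```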
3.3 Formation of nonatomic objects.
The construct {{expression}} creates an object of non atomic datatype whose first element is the first expression in the list and so on. {} denotes an empty nonatomic object.
For this purpose, structures are ordered in the way their elements were declared in the structure definition of their type.
"" denotes an empty string.
3.4 Accessing elements of nonatomic objects.
Single elements of nonatomic objects (or nonatoms in the sequel) are accessed using an index enclosed between square brackets, as in ThisList[4]. Note that any nonatom, even the returned value from a function, may be indexed.
Structures have named parts, or members, which are accessed using the syntax: record name.member name, like in ThisCustomer.name.
Since structure fields are declared in an ordered way, records also support indexed accessing: the index n then refers to the n-th field in the declaring enumeration.
Indexes may be negative, in which case the elements are counted backwards. So, ThisList[-1] is the last element of ThisList, ThisList[-2] the second last and so on.
You can use floating point numbers as indexes. They are rounded to the next integer downward before any further processing. So, s[-0.3] is s[-1].
0 is never a valid standalone index. See section 3.5 below for valid uses of 0 in index specifications.
Indexes whose absolute value is greater than one plus the length of the container they index always cause an exception. 0 and +/-(length(container)+1) are only allowed when specifying an empty slice (see 3.5.2 below).
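The indexing rules of this section (downward rounding, backward counting for negative values, and 0 never being a valid standalone index) can be summarized by a small Python sketch. The function name normalize_index is hypothetical; it returns the 1-based position an index designates, or raises an exception.

```python
import math


def normalize_index(index, length):
    """Convert an index as described above into a 1-based position."""
    i = math.floor(index)          # floating point indexes are rounded downward
    if i < 0:
        i = length + 1 + i         # -1 is the last element, -2 the second last
    if i < 1 or i > length:
        raise IndexError(f"index {index} is not valid for length {length}")
    return i


print(normalize_index(-0.3, 7))    # 7, since s[-0.3] is s[-1]
```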
3.5 Statically accessing parts of nonatomic objects.
3.5.1 Nonempty slices.
Accessing several elements in a row is possible, and is done through slices. A slice is a comma separated list of indexes and ranges. A range is specified as lower..upper, where lower and upper are the lower and upper desired index values. Obviously, the latter is not less than the former, after conversion to positive standard indexes.
So, the statement:
NewList=ThisList[3,1..-4,-2,3..6]
generates a list formed of elements of ThisList, in the following way:
NewList[1] is ThisList[3]
NewList[2] is ThisList[1]
NewList[3] is ThisList[2]
...
NewList[-5] is ThisList[-2]
NewList[-4] is ThisList[3]
...
NewList[-1] is ThisList[6]
Of course, if an element of a non-atom is a non-atom itself, several square bracketed index specifications may follow one another, like in mymatrix[1..3][4]. This is a sequence of length three, exactly {mymatrix[1][4],mymatrix[2][4],mymatrix[3][4]}.
For structures, use names rather than indexes, even though they are just as valid ways to access structure parts. For instance, the following
CustList[27..41][name,zipcode,nbOrders]
will generate a sequence of data extracted from a 15-element subsequence of CustList starting with the 27-th. We assumed that CustList is a sequence of Customers, which are structures the declaration of which involves members named name, nbOrders and zipcode. The statement above generates a sequence since the type of all its elements is the same. Each of its elements is a sequence (of object) of length 3, since it is quite likely formed by a string, another string and an integer.
name has a rank in the enumeration of fields that build the Customer type. If that rank is 3, you could code
CustList[27..41][3,zipcode,nbOrders]
with exactly the same meaning as above. However, if that rank changes in future versions of your program, the "3" index will have to be changed to its new value, while the field name would remain the same. This is why using names is recommended over using indexes when possible.
3.5.2 Empty slices.
You may specify empty slices when they are made of ranges the upper value of which is exactly one less than the lower value. One of the index values must be valid though.
Thus, s[2..1], s[4..-5] and s[1..0] are all empty objects, assuming length(s)=7 so that -5 reads as 3. The last example is the only valid use of 0 in indexes; any other situation causes an exception. In the same vein, s[2..3,1..0] has length 2, while s[3..2,1..0] is empty, since the set of selected indexes is the union of two empty sets.
But s[13..12] causes an exception, since both index values are way out of range.
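The slice rules of sections 3.5.1 and 3.5.2 can be modelled in Python as follows. This is a sketch under stated assumptions: lists stand for nonatoms, a (lower, upper) tuple stands for the range lower..upper, and the names position and take are hypothetical.

```python
import math


def position(index, length):
    """Turn an index into a 1-based position; 0 and length+1 are let through
    because they may legitimately bound an empty range."""
    i = math.floor(index)
    if i < 0:
        i = length + 1 + i
    if i < 0 or i > length + 1:
        raise IndexError(f"index {index} is out of range")
    return i


def take(seq, *specs):
    """Model seq[spec, spec, ...]; a (lower, upper) tuple stands for lower..upper."""
    n = len(seq)
    out = []
    for spec in specs:
        lower, upper = spec if isinstance(spec, tuple) else (spec, spec)
        lo, hi = position(lower, n), position(upper, n)
        if hi == lo - 1 and (1 <= lo <= n or 1 <= hi <= n):
            continue                      # valid empty range contributes nothing
        if lo < 1 or hi > n or hi < lo:
            raise IndexError(f"invalid index specification {spec}")
        out.extend(seq[lo - 1:hi])
    return out


s = list("abcdefg")                       # length(s)=7, as in the text
print(take(s, (2, 3), (1, 0)))            # ['b', 'c']
print(take(s, (4, -5)))                   # []
```

With this model, take(s, (13, 12)) raises an exception because both bounds are out of range, as the text requires.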
3.6 Dynamically accessing parts of non-atoms.
If s has nonatomic type and if t has the format described below, s[[t]] is a valid syntax for a part of s with variable index depth. This syntax is specially handy for tree management. s[[t]] is s[t[1]][t[2]]...[t[length(t)]].
Each t[i] must be a sequence made of atoms and sequences of length 1 or 2. Atoms are converted to sequences of length 1, and both specify single indexes. Sequences of length 2 stand for slices in an obvious lower..upper way.
For instance, assume t={2,{-1,{3,4}},{{1,4}}}. Then
s[[t]]=s[2][-1,3..4][1..4]
This sequence has three elements, each of which is of length 4. Each of them consists of the 4 first elements of elements of s[2]. The first of these elements is the last of s[2]; the second and third are respectively third and fourth in s[2].
The sequence t is said to be a subexpression representation for s[[t]] relative to s.
Additionally, note that a list of indexes may come from a sequence through desequencing (see 5.5.2 below), so that s[#(t)] stands for s[t[1],t[2],...,t[-1]].
3.7 Manipulating nonatoms.
Manipulating nonatoms is always necessary once they are created and populated. While these manipulations can be done through a limited set of operations and routines stored in external files, this implies loss of performance and frequent reinventing of the wheel. For these reasons, OpenEuphoria provides quite a few built-in handling routines for nonatoms.
3.7.1 Getting information about nonatoms.
Functions are provided in order to know how many, and which, elements are in the nonatom:
- length(target)
- is the number of elements of the nonatom target.
- find(what,target)
- returns 0 if what is not an element of target. Otherwise, returns a positive integer, which is the index of the first element of target to equal what.
- find_all(what,target)
- returns the possibly empty sequence of all integers which are indexes of elements of target which equal what.
- match(what,target)
- returns the lowest integer such that some slice of target starting at that position equals what. If there is no such integer, and what is not the empty sequence, 0 is returned. If what is the empty sequence, -1 is returned if target is not empty, and -2 if it is.
If what is an atom, match returns as find would.
- match_all(what,target)
- returns the possibly empty list of all integers i such that some slice of target starting at i equals what. Always returns the empty sequence if what is the empty sequence.
- wildcard_match(pattern,subject)
- Returns True if subject matches pattern and False otherwise. pattern may contain '?' and '*' wildcards.
- wildcard_file(pattern,subject)
- Same as wildcard_match, but platform-dependent, as the matches are case sensitive only on Linux/FreeBSD systems.
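The return conventions of find, find_all and match can be illustrated with a short Python sketch. Lists stand for nonatoms and indexes are 1-based as in the text; these functions are illustrative models, not the built-in routines themselves.

```python
def find(what, target):
    """1-based index of the first element equal to what, or 0."""
    for i, element in enumerate(target, start=1):
        if element == what:
            return i
    return 0


def find_all(what, target):
    """All 1-based indexes of elements equal to what."""
    return [i for i, element in enumerate(target, start=1) if element == what]


def match(what, target):
    """Lowest 1-based start of a slice of target equal to what."""
    if not isinstance(what, list):
        return find(what, target)          # atoms behave as with find
    if not what:
        return -1 if target else -2        # special values for the empty sequence
    for start in range(len(target) - len(what) + 1):
        if target[start:start + len(what)] == what:
            return start + 1
    return 0


print(find(3, [1, 3, 3]))          # 2
print(find_all(3, [1, 3, 3]))      # [2, 3]
print(match([3, 3], [1, 3, 3]))    # 2
```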
3.7.2 Adding elements to sequences.
Elements may be added to nonatoms as single objects or sequences, at any position in the sequence.
Here is a list of available routines:
- the & operator.
- If s1 and s2 are nonatoms, s1&s2 is a sequence of length the sum of the lengths of s1 and s2. The length(s1) first elements of s1&s2 are those of s1; they are followed by those of s2.
- append(s,x)
- is a sequence obtained by adding x to s as its last element. Its length is length(s)+1 whatever x is.
- prepend(s,x)
- is a sequence obtained by adding x to s as its first element. Its length is length(s)+1 whatever x is.
- insert(target,places,added)
- is a sequence where the elements of nonatom added are inserted as single objects inside the nonatom target at the locations given by the nonatom places. The length of the returned sequence is the sum of the lengths of target and added.
places must be strictly increasing; strange, but sometimes desired effects may result otherwise. places may be an atom, in which case it is converted into a sequence of length 1 before further processing.
- insert_sequence(target,places,added)
- is a sequence where the elements of nonatom added are inserted as sequences inside the nonatom target at the locations given by the nonatom places. The length of the returned sequence is the sum of the length of target, the lengths of the nonatoms of added and the number of atoms in added.
places should be strictly increasing; strange, but sometimes desired effects may result otherwise. places may be an atom, in which case it is converted into a sequence of length 1 before further processing.
3.7.3 Removing elements from sequences.
Three functions are provided:
3.7.3.1 The remove function.
remove(target,places) returns the sequence target from which the elements whose index belongs to the sequence of integers places were removed, notionally starting from the last. places is assumed to be sorted in ascending order; strange, but sometimes desired results might happen otherwise.
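The phrase "notionally starting from the last" can be made concrete with a Python sketch. It assumes places is sorted in ascending order and holds 1-based indexes; the function is an illustrative model, not the built-in itself.

```python
def remove(target, places):
    """Return a copy of target without the elements at the 1-based places."""
    result = list(target)
    # Deleting from the last place backwards keeps earlier indexes valid.
    for place in reversed(places):
        del result[place - 1]
    return result


print(remove([10, 20, 30, 40], [2, 4]))   # [10, 30]
```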
3.7.3.2 The replace function.
replace(target,places,items) returns a nonatom of the same type as target. It is obtained by removing from target the slices specified in places and replacing the carved out slices by elements of the sequence of nonatoms items, inserted as sequences. Each element of the sequence places is a pair of integers, the lower and upper bounds of each slice to process. items must have the same length as places, as each slice specified in places is replaced by the element at the same position in items.
If places has the form {i1,i2}, where i1 and i2 are integers, this is converted to {{i1,i2}} first. If places is just an integer i, this is changed to {{i,i}}.
3.7.3.3 The fit procedure.
The call fit(target,source,padding) causes source to be copied to target even though the lengths may not match. In this case, an ordinary assignment would have raised an error.
If length(target)<=length(source), only the portion of source that fits into target is copied, effectively discarding elements of source with higher indexes.
Otherwise, if padding is a char, the elements of target in excess relative to source's length are replaced by that char. padding may have the special value _, in which case these elements of target remain unchanged.
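The behaviour of fit in both cases can be sketched in Python, with a list modified in place playing the role of target. The KEEP marker is a hypothetical stand-in for the special padding value _, and the function is an illustrative model only.

```python
KEEP = object()   # hypothetical marker standing for the special padding value _


def fit(target, source, padding):
    """Copy source into the list target in place, whatever the lengths."""
    n = min(len(target), len(source))
    target[:n] = source[:n]                 # copy what fits
    if padding is not KEEP:
        for i in range(n, len(target)):     # positions source does not cover
            target[i] = padding


t = list("abcde")
fit(t, list("xy"), " ")
print(t)                                    # ['x', 'y', ' ', ' ', ' ']
```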
3.7.4 Permutations on non-atoms.
Nonatoms are ordered sets of elements; so, they can be reordered. As the number of permutations on a given number of symbols rapidly increases with that number, it is neither practical nor efficient to directly specify a permutation of a nonatom. However, the following functions cover the most frequent cases and can be combined into any sort of shuffling.
- reverse(target)
- returns the nonatom target with its order reversed: the first element becomes the last, the second element becomes the second last, and so on.
- move(target,start,end,where)
- moves target[start..end] to position where in target. An error will occur if the shifted slice extends past the end of the nonatom, i.e. if where+end-start-1 is greater than length(target).
- sort(target)
- Returns target sorted in increasing order.
- custom_sort(sort_id,target)
- Returns target, sorted in increasing order from the sorting routine's standpoint. The sorting routine, whose routine_id is sort_id, takes two arguments like compare and returns an integer in {-1,0,1}. routine_id is explained here.
3.7.5 Slice transfers.
Also called destructive assignments, they make it possible to simultaneously remove a slice from a nonatom and make it appear somehow in another one. The four following procedures perform this task:
- cut_paste(target,position,source,start,end)
- Removes the slice [start..end] from source and copies it onto target, starting at position. An error occurs if the pasted slice extends past the end of target.
- transfer(target,position,source,start,end)
- Removes the slice [start..end] from source and inserts it into target, the first element of the cut slice appearing at position.
- transfer_as_one(target,position,source,start,end)
- Removes the slice [start..end] from source and inserts it into target at position as a new element.
- swap_slices(target,t_start,t_end,source,s_start,s_end)
- Swaps the slices source[s_start..s_end] and target[t_start..t_end]. The slices need not have the same length.
Since these routines involve moving data away from its original location, their use may require more care than other more classic nonatom handling tools.
3.7.6 Other operations on nonatoms.
Here are a few routines operating on nonatoms that did not seem to fit in previous sections:
- repeat(what,howmany)
- Returns a sequence of length howmany, each element of which equals what.
- repeat_pattern(what,howmany)
- Returns a sequence of length howmany*length(what), made of howmany copies of what concatenated together. Acts as repeat above if what is atomic.
- lower(string)
- Converts all chars from string to lower case.
- upper(string)
- Converts all chars from string to upper case.
- sprint(anything)
- Returns the flat text representation of anything as a string. The I/O procedure print does the same job, but sprint outputs to a string and not to an I/O channel.
- sprintf(format,values)
- Outputs the result of replacing, in format, each format specifier by the value at the right position in values printed with this format.
- value(string)
- Reads a valid flat text representation of an object from string as get would from an I/O channel, and returns a pair {status,value} as get would.
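The difference between repeat and repeat_pattern in the list above is easy to miss, so here is a small Python sketch of both. Lists stand for nonatoms, and the functions are illustrative models rather than the built-ins.

```python
def repeat(what, howmany):
    """A list of howmany elements, each equal to what."""
    return [what for _ in range(howmany)]


def repeat_pattern(what, howmany):
    """howmany copies of what concatenated; same as repeat for atoms."""
    if not isinstance(what, list):
        return repeat(what, howmany)
    return what * howmany


print(repeat([1, 2], 3))          # [[1, 2], [1, 2], [1, 2]]
print(repeat_pattern([1, 2], 3))  # [1, 2, 1, 2, 1, 2]
```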
3.7.7 A further note about slicing.
Euphoria 2.4 only allows slicing once, as the last operation after any number of simple subscriptings. This makes it possible to see subscripting and slicing as operators, as t=s[1][2..3] is the same as s1=s[1] t=s1[2..3].
OpenEuphoria lifts the restriction, at the cost of no longer being able to consider slicing and subscripting as operators. The syntaxes allowed in Euphoria keep the same meaning in OpenEuphoria. However, s[2..3][1] is definitely not the same as s1=s[2..3] t=s1[1]. The latter is a convoluted manner of assigning s[2] to t. s[2..3][1] is meant to be a two element sequence, namely {s[2][1],s[3][1]}.
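The contrast between s[2..3][1] and the two-step form can be shown with a Python sketch. Lists stand for nonatoms, bounds are 1-based and inclusive, and the function name subscript_after_slice is hypothetical.

```python
def subscript_after_slice(s, lower, upper, index):
    """Model s[lower..upper][index] with 1-based, inclusive bounds."""
    selected = s[lower - 1:upper]
    # The trailing subscript applies to each selected element in turn.
    return [element[index - 1] for element in selected]


s = [[10, 11], [20, 21], [30, 31]]
print(subscript_after_slice(s, 2, 3, 1))   # [20, 30]

# The two-step form picks one element of the slice instead:
s1 = s[2 - 1:3]
t = s1[1 - 1]
print(t)                                   # [20, 21], i.e. s[2]
```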
4 Condition evaluation.
Conditions are made of clauses linked together by logicals. A condition must evaluate to a boolean value of True or False. 0 stands for False, any other atom stands for True.
4.1 Truth tables for logicals.
A truth table is a table assigning a boolean return value to any pair of booleans. To draw truth tables easily, we'll represent True by T and False by F.
4.1.1 The "and" operator.
   ! F ! T !
---+---+---!
 F ! F ! F !   Read: "and" returns False, except when both arguments are True.
---+---+---!   In that case only, it returns True.
 T ! F ! T !
-----------!
4.1.2 The "or" operator.
   ! F ! T !
---+---+---!
 F ! F ! T !   Read: "or" returns True, except when both arguments are False.
---+---+---!   In that case only, it returns False.
 T ! T ! T !
-----------!
4.1.3 The "not" operator.
   !   !
---+---!
 F ! T !   Read: "not" returns True if its argument is False, and False
---+---!   otherwise.
 T ! F !
-------!
4.1.4 The = operator.
   ! F ! T !
---+---+---!
 F ! T ! F !   Read: "=" returns True when its operands have the same boolean
---+---+---!   value; else it returns False. This is the truth table of the
 T ! F ! T !   "not xor" logical operator.
-----------!
4.2 Short-circuit evaluation.
From close inspection of the tables above, it follows that you need not always compute both arguments of a logical to know its return value; computing the first one is often enough.
This saves useless instruction execution, and may greatly simplify programming. The short-circuit rules are:
- The second argument of "and" is computed if and only if the first argument is True.
- The second argument of "or" is computed if and only if the first one is False.
Note that short-circuit evaluation applies to any use of logicals. This is not true in Euphoria, where it only applies inside the conditions of if, elsif and while statements. The with RDS directive turns compatibility mode on and off in this respect as well. The directive with[out] all_short_circuit allows toggling this setting independently of the others.
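The two short-circuit rules can be observed in any language that implements them. The following Python sketch records which clauses are actually evaluated; Python's own and/or operators follow the same rules, and the helper name clause is hypothetical.

```python
calls = []


def clause(name, value):
    """Record that a clause was evaluated, then return its value."""
    calls.append(name)
    return value


result_and = clause("a", False) and clause("b", True)   # "b" is skipped
result_or = clause("c", True) or clause("d", False)     # "d" is skipped
print(calls)   # ['a', 'c']
```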
4.3 Example code: finding a name in an address book.
Assume Address is a record type that has a member called name, and that addrbook is a sequence of Address. Then
i=1
while i<=length(addrbook) and addrbook[i].name!="myname" do
    i+=1
end while
if i>length(addrbook) then i=0 end if
will scan the address book for a record whose name member is equal to "myname". A value of 0 stands for name not found; else i holds the ordinal of the first occurrence of "myname" in a member of a record in addrbook.
Without short-circuit evaluation, this code would fail if "myname" is not found, because addrbook[length(addrbook)+1] would be evaluated, causing an exception. In such a case, the code would be something like:
found=0
for i=1 to length(addrbook) do
    if addrbook[i].name="myname" then
        found=i exit
    end if
end for
So, an extra state variable is needed: even if i is available after the end for statement, a maximal value for i may mean that "myname" appeared as the last name or did not appear at all. The found variable is 0 on failure, and else means as above. And what if there was no exit statement?
4.4 Side effects.
As routine calls are resolved first, they may affect the variables appearing in a condition.
Further, it may be desirable to record the value of expressions that appeared inside conditions. Because of the short-circuit evaluation capabilities of OpenEuphoria, it is not always possible to compute the expressions prior to the condition evaluation, as this may raise exceptions.
To address this situation, you can embed assignments in conditions, using the := form of the assignment operator.
So:
if f0(a)=x:=f(a) and b=y:=g(a) then ...
will result in the following:
- x will always hold the value of f(a), regardless of what happens next. The x:=f(a) assignment might have been taken out of the if statement, for better readability.
- if f0(a)=f(a), y will hold the value of g(a) at the time it was computed, thus taking into account the possible side effects of f and f0. It is not modified otherwise.
An obvious use of this feature is to know why an if block was entered or not in the case of several clauses in the condition.
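Python's assignment expressions behave in a comparable way and can serve as a rough model of the := form. In this sketch f0, f, g and check are made-up routines, and None stands for "left unmodified"; it is an analogy, not OpenEuphoria syntax.

```python
def f0(a):
    return a + 1


def f(a):
    return abs(a) + 1


def g(a):
    return a * 2


def check(a, b):
    x = y = None  # None models "not assigned by the condition"
    # Python requires parentheses around each embedded assignment.
    entered = f0(a) == (x := f(a)) and b == (y := g(a))
    return entered, x, y
```

After the call, x always holds f(a), while y is set only when the first clause held, so inspecting y tells which clause stopped the evaluation.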
5 Variables.
Variables are sets of data with enough logical links to be referred to by tags. These tags help identify this data for the program where they appear to act upon them. These tags are general_identifiers.
5.1 Properties of a variable.
A variable has a number of attributes, or attached data that can be retrieved. They are called metadata, and are retrieved using the construct general_identifier@meta.
For variables, the available metadata are:
- name: x@name is "x". Seems redundant, but see 5.4 below.
- assigned: x@assigned is False if x never was assigned a value, True otherwise.
- value: x@value holds the contents of x. Valid only if x@assigned is True. Provided for completeness only.
- size: the number of bytes x occupies in memory. This is mainly useful for interfacing with other languages.
- type: the routine id of the type checking routine assigned to the variable when it was declared.
- deftype: the routine id of the common type of all elements of a nonatom. This is -1 for atoms.
- id: an integer you can use as an alternate way to access x (see 5.4).
- scope: a value that tells in which part of the program the symbol is defined. See 5.2 below.
- format: a default format used to display the value of the variable. See format string specification in the entry for printf() in part C.
- decl_mode: this is True if the variable was declared using new_var(), and False otherwise.
- readonly: roNo for variables, roYes for locked variables, roConst for constants.
- types: a sequence of integers the length of the nonatom. Each integer is the routine_id of the type function of the matching element. For atoms, this sequence is empty.
Only the value and format metadata can be directly changed; the other attributes are read-only, or can be changed only through dedicated routines. Metadata also apply to RAM structures, but with slight variations.
However, due to their special nature, the deftype metadata of raw RAM structures can be changed. See the specifics under specific meaning of metadata.
A structure holding all the metadata of a single symbol can be retrieved using the get_meta function. The structure has the reserved type SystemVarMeta and has elements with the names and indexes as in the list above. The argument of get_meta is an expression evaluating to the id or name of the variable whose metadata are requested.
Formats are specified as for printf(). See the entry for this function in the alphabetical part C. The format metadata is used only if its value differs from "", which is its default.
5.2 Scope of a variable.
A program is made of a main file (the one you feed the interpreter with) and zero or more auxiliary files. Named scopes inside files may exist (see Chapter 6, "Included files and namespaces".). Both are referred to as abstract files.
A symbol can be visible:
- from more than one abstract file in the program
- from the abstract file it lives in only
- from only part of a single file
As a result, the scope metadata has three possible values: sGlobal, sPublic and sPrivate, respectively.
Symbols that have different scopes may coexist. But, at any given time, only one of them is referred to by the name they share. This symbol is said to shadow the others.
Private symbols shadow public symbols, and public symbols shadow global symbols without namespace.
A clash between two symbols sharing the same name and both visible at some point is an error condition, since the interpreter does not know which one the general_identifier designates. Obviously, the error occurs only when the ambiguous symbol is used.
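The shadowing and clash rules can be pictured with a small Python model. This is only a sketch under stated assumptions: the resolve function, the three dictionaries and the file names are hypothetical, and the real interpreter's lookup is certainly more involved.

```python
def resolve(name, private, public, global_files):
    """Return (scope, value) for an unprefixed name.

    private and public map names to values; global_files maps a file name
    to the global symbols that file exports.
    """
    if name in private:                      # private shadows everything else
        return "sPrivate", private[name]
    if name in public:                       # public shadows unprefixed globals
        return "sPublic", public[name]
    owners = [f for f, symbols in global_files.items() if name in symbols]
    if len(owners) > 1:                      # clash, reported only on use
        raise NameError(f"ambiguous symbol {name!r}: {sorted(owners)}")
    if not owners:
        raise NameError(f"unknown symbol {name!r}")
    return "sGlobal", global_files[owners[0]][name]
```

Two files exporting the same global name do no harm until resolve is actually asked for that name, which matches the rule that the error occurs only on use.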
The word "symbol" is purposely used here instead of "variable", because the notions above also apply to routines (see Chapter 8 "Routines").
5.3 Declaring a variable.
5.3.1 Type of a variable.
Types in OpenEuphoria describe logical properties of values a variable may hold. There are four ways to declare a variable, and all but one require an explicit type:
- declaration in a var-decl statement;
- declaration by on-the-fly creation;
- declaration as formal parameter of a routine;
- declaration as a for loop index.
Only in the last case is explicit typing absent. But, from the values the three parameters of a for loop have, an integer or atom type is guaranteed.
Formally, there are four sorts of types in OpenEuphoria:
- Built-in types do not require definitions; the language gives them away for free. They are listed in Chapter 1.
- Nonatomic types built using built-in types;
- User-defined types (see 1.3);
- Compound types not entirely made from built-in types (see Nonatomic types).
5.3.2 Type of a nonatom element.
Nonatoms rely on a default type, which is their deftype metadata. Elements of nonatoms may have any type, but they should pass the type checking thus defined. They are registered as having this default type.
The programmer always has the option to specify the type of an element in a nonatom using the cast primitive. When this happens, a type check of the current element using the supplied type is performed, again regardless of current type checking status.
A call to the cast primitive has the following form:
cast(container,indexes,type)
This procedure call acts on the nonatom container by name or id. It sets the type information for elements in container[indexes] to type. type is either a type name or the routine id of the type function. indexes may be any slice specification.
5.3.3 Declarations.
A variable must be declared before being used. The only exception to this principle is the index of a for loop.
A variable definition takes the following form:
[global |static ]type {identifier[=value] }
that is, a type name followed by one or more items. These items are either variable names or name=value initialized variables.
The variable's initial value is computed before the variable is created. This allows an identifier to shadow another one while retrieving the shadowed value at initialization time.
The optional global keyword makes the variable(s) visible outside of their current abstract file, giving them a scope metadata of sGlobal. It is not allowed for private variables in routines.
The optional static keyword applies to routine private variables. It makes their values persist between invocations of the routine.
A declaration may appear in any place outside routines or blocks.
Declarations in routines must be grouped right after the routine definition, as in:
function deloddnumbers(sequence of integer s)
integer i=1
sequence s0
--you can't move any of the two lines above past here.
s0=remainder(s,2)
while i<=length(s) do
    if s0[i]=1 then
        s=remove(s,i)
        s0=remove(s0,i) --keep the parity list aligned with s
    else i+=1 --it is easy to forget, but definitely necessary...
    end if
end while
return s
end function
Additionally, since routine variables are private, you cannot declare them as global.
Section 5.4 below will show you how to relax the restrictions above.
5.3.4 Constants.
Constants are identifiers that are assigned a value at initialization time. That value cannot change thereafter. Using constants instead of hard-coded repeated values is recommended for two reasons at least:
- VAT_rate may look more self-explanatory, when looking at the program source for maintenance or debugging, than, say, 0.0825;
- If the repeatedly hard-coded value is to change, all relevant instances of it must be changed in scattered places, which may prove a tedious, error-prone process. By contrast, declaring that value as a named constant means that only one place in the code has to change.
Declaring a constant takes the following form, quite similar to a variable declaration:
[global ]constant {[type ]name=value}
Indeed, you can declare any number of constants in a single statement. This statement may appear everywhere a variable declaration is allowed. Contrary to variables, the type specification is optional, a type of object being assumed if it is not present.
It may happen that a constant is declared with some value even though a constant with the same name and the same value is visible. In this case, the duplicate declarations are ignored; Euphoria throws an error in this situation. Note that a constant defined inside a routine cannot be global.
Attempting to modify the value of a constant will raise an exception. There is no way to change the value of a constant using only OpenEuphoria statements.
5.4 Variable id's.
5.4.1 The id metadata.
Rather than being referred to by its name, a symbol can be accessed through its id metadata. Routines will have routine_id's (see Chapter 8), and variables have variable_id's. One may consider that all variables are named elements of a large sequence, and the variable_id's are indexes into this sequence.
When a variable is destroyed in any fashion, mainly because it is a private, nonstatic variable of a returning routine, its id is not recycled. This guarantees that an id always refers to the same variable or to no variable at all, in which case any kind of use will cause an error.
The built-in function isvarid takes an integer and returns True if this integer is the id of a variable and False otherwise.
Individual elements of nonatoms have variable ids, so that "s[3][5]@id" makes sense and returns an integer you can use as shown below. The id "follows" the element it tags during the transformations of the host array/sequence, so that the returned id may well give you the contents of s[2][7] if some elements were added or removed from s[3] or s.
5.4.2 Manipulating existing variables.
Five routines are provided to handle variables through their id's:
- id(name) returns the id of the variable whose name name evaluates to. -1 is returned if no such variable exists.
- get_var(id) returns the value of the variable with that id.
- set_var(id,value) sets the value of the variable with that id to value.
- var(id) returns the name of the variable with that id. For elements of nonatoms, their name is returned, or "" if none is applicable.
- analyze_id(id) returns a sequence of object of length 3. The first element is the variable name, or element name, or "" for unnamed nonatom elements. The second element is the index of the element if applicable, or 0 otherwise. The third element is the id of the parent if applicable, or id itself otherwise.
Recursively calling analyze_id will yield the index sequence by which you can access the element with id id deeply nested in a nonatom.
Note that set_var will fail if the symbol with this id is not to be written to ( var(id)@readonly != roNo ).
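The repeated use of analyze_id mentioned above might look like the following Python sketch. It assumes that analyze_id returns a (name, index, parent id) triple exactly as listed, with the parent id equal to the id itself for a top-level variable; index_path is a name invented for this illustration.

```python
def index_path(analyze_id, vid):
    """Walk from an element id up to its top-level variable.

    analyze_id is any callable returning (name, index, parent_id) for an id.
    Returns (root_name, [index, ...]) with the outermost index first.
    """
    path = []
    name, index, parent = analyze_id(vid)
    while parent != vid:          # stop once the top-level variable is reached
        path.append(index)
        vid = parent
        name, index, parent = analyze_id(vid)
    path.reverse()
    return name, path
```

With a table in which id 12 is element 5 of id 11, itself element 3 of the variable s, the function yields the name s and the indexes 3 and 5, that is, the element s[3][5].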
Example: assume you have a variable named balance. Its value must be assigned to the variable credit if it is nonnegative, and to the variable debit if it is less than 0. You also want to print a message reflecting what has just been done. The printing format of credits may not be the same as for debits.
A simple solution can be devised using the tools above:
baltype={var_id(credit),var_id(debit)}
...
b_id=baltype[1+(balance<0)]
set_var(b_id,balance)
msg=sprintf("Your %s is " & var(b_id)@format,{var(b_id),get_var(b_id)})
In Euphoria (2.4 and before), you'd have to explicitly write an if statement to perform this admittedly simple task.
Also note that, since variable id's are global, they can be used to access shadowed symbols or static private variables.
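The picture of variables as named elements of one large sequence, indexed by id, can be modelled in Python as follows. This is a sketch only: the VarTable class, its fmt method and post_balance are assumptions made for illustration, with method names borrowed from the routines listed in 5.4.2.

```python
class VarTable:
    """Variables as slots of one large list; the index is the variable id."""

    def __init__(self):
        self._slots = []  # a deleted variable leaves None behind: ids are never recycled

    def new_var(self, name, value=None, fmt="%s"):
        self._slots.append({"name": name, "value": value, "format": fmt})
        return len(self._slots) - 1

    def isvarid(self, vid):
        return 0 <= vid < len(self._slots) and self._slots[vid] is not None

    def _slot(self, vid):
        if not self.isvarid(vid):
            raise LookupError(f"{vid} is not a variable id")
        return self._slots[vid]

    def get_var(self, vid):
        return self._slot(vid)["value"]

    def set_var(self, vid, value):
        self._slot(vid)["value"] = value

    def var(self, vid):
        return self._slot(vid)["name"]

    def fmt(self, vid):
        return self._slot(vid)["format"]

    def del_var(self, vid):
        self._slot(vid)           # raises if the id is not valid
        self._slots[vid] = None


def post_balance(table, baltype, balance):
    """Route balance to credit or debit by id, then build the message."""
    b_id = baltype[balance < 0]   # False -> index 0 (credit), True -> index 1 (debit)
    table.set_var(b_id, balance)
    template = "Your %s is " + table.fmt(b_id)
    return template % (table.var(b_id), table.get_var(b_id))
```

Selecting the target by indexing a list of ids replaces the explicit if statement, which is the point of the balance example.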
5.4.3 Creating variables on the fly.
It may be useful to create variables in other places than in variable declarations, especially inside routines. This can be done as follows:
id=new_var(type,name,_)
id=new_var(type,name,value)
This is equivalent to saying in the proper place "type name" or "type name=value", and gives you the id for this variable. Note the use of the anonymous placeholder '_' when no initial value is provided.
Also note that variables can be created conditionally using this mechanism. You cannot new_var a global or static variable. A variable declared in this way is private if it is inside a routine, and public otherwise.
Creating a symbol clashing with an existing one, or accessing an id that does not exist, are error conditions, as might be expected.
5.4.4 Deleting variables.
As new_var is primarily intended to create temporary variables, you may remove them once their short life span is over. This can be done as follows:
del_var(id)
For obvious reasons, there are limitations on the use of such a tool:
- You cannot del_var a symbol not declared with new_var();
- The deleted symbol must not be shadowed at the time it is deleted. This condition can be restated as: id=var(id)@id.
5.5 Using a variable.
5.5.1 Variables and values.
If the general_identifier of a variable appears on the left side of an assignment symbol (see 5.5.2 below), its value will be subject to change:
- the righthand side of the assignment symbol is evaluated;
- the type function associated to the variable may be called, with the resulting value as an argument.
- if the variable can be written to, and if the above call returned a nonzero atom, the value becomes the new value of the variable. Otherwise, an exception is raised.
Otherwise, the value of the variable is substituted for its identifier at run-time.
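The three assignment steps listed above can be sketched in Python. The Variable class, the AssignmentError exception and is_natural are hypothetical names; the sketch always calls the type function, whereas the text only says it may be called.

```python
class AssignmentError(Exception):
    pass


class Variable:
    def __init__(self, name, type_fn, readonly=False):
        self.name = name
        self.type_fn = type_fn      # validation function, truthy on success
        self.readonly = readonly
        self.assigned = False
        self.value = None

    def assign(self, compute):
        value = compute()                # 1. evaluate the righthand side
        passed = self.type_fn(value)     # 2. call the type function on the result
        if self.readonly or not passed:  # 3. store only if writable and accepted
            raise AssignmentError(f"cannot assign {value!r} to {self.name}")
        self.value = value
        self.assigned = True


def is_natural(value):
    return isinstance(value, int) and value >= 0
```

A rejected assignment leaves the previous value untouched, since the new value is stored only after both checks have succeeded.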
If a variable is passed by reference to a routine (see 8.3), the routine will modify it only if it can write to it.
5.5.2 Assignments.
A variable may be assigned a value using its id and the set_var routine, or using an initialization on declaration; but these are by no means the most frequent way of doing it.
There are three ways to assign a value to a variable using assignment operators:
general_identifier assignment expression
#({general_identifiers}) assignment expression
#({general_identifiers})# assignment expression
In the first form, a variable gets (modified by) the value on the right side. The second form allows this to take place on several variables at the same time, so that they are assigned, or modified by, the same value to which the righthand side evaluates.
The third form normally requires the righthand side of the assignment to be a nonatom. The first element of the list on the left side of the second # is assigned the first element on the right side, and so on until one or both sides run out of elements. It could be called "desequencing", as it sends the contents of a sequence to several variables. If the righthand side is an atom, it is treated as a sequence of length 1.
To skip elements of the righthand side in intermediate positions (that is, when an element of higher index is still to be retrieved), use the "_" universal placeholder where a variable would be expected. This effectively discards the value that would otherwise be assigned, so that that value is not even computed.
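The desequencing rules just described can be modelled in a few lines of Python. In this sketch SKIP stands for the "_" placeholder and env is a plain dictionary playing the role of the variables; both are assumptions of the model, which also cannot show that a skipped value is never computed.

```python
SKIP = object()  # stands in for the "_" placeholder


def desequence(targets, value, env):
    """Model of  #(targets)# = value ; env maps variable names to values."""
    # An atom on the righthand side is treated as a sequence of length 1.
    items = value if isinstance(value, (list, tuple)) else [value]
    # zip stops as soon as either side runs out of elements.
    for target, item in zip(targets, items):
        if target is not SKIP:
            env[target] = item
    return env
```

With targets name, SKIP and phone, a two-element righthand side sets name only and leaves phone as it was.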
As an example, assume seQ is a sequence with a last name, a first name and possibly a phone number. You want to get the last name, and the phone number if available. For some reason, you don't want to test the length of seQ. You can do the following:
#(name,_,phone)#=seQ
name always gets seQ[1]. seQ[2] is never inspected. If seQ[3] exists, it is assigned to phone, otherwise nothing happens.
5.6 Aliasing an element of a variable.
You can specify an alternate name for an element of a nonatom. This is specially handy when complex index specifications are involved. The available tools are:
name aliased as alias
rename aliased as newalias
unname alias
name supplies an alternate name for an element in an array or sequence.
rename changes an existing alias to another one.
unname makes an alias unavailable.
Aliases, in all this section, are identifiers, while aliased has the form identifier[index specification]. They act exactly as structure members do. As a result, an element keeps its name even if its position in the host sequence changes, as long as it exists. If a named element is removed from a sequence in any way, it is considered as unnamed. Any reference to a name that does not exist raises the UnknownToken exception.
6 Included files and namespaces.
OpenEuphoria adopted the open philosophy of Euphoria in the sense that a lot of functionality is to be found in libraries rather than in the language itself. The main advantage is that anyone can customize or upgrade routines in the libraries easily - they are plain text source files -, rather than tinker with the OpenEuphoria interpreter/compiler source itself, which may be written in another language. The drawbacks are loss of performance, version conflicts and symbol clashes.
The physical files in which the program looks for such material are called included files.
6.1 Namespaces.
Because symbols from different files may share the same name and be visible from the same location in the program, there must be a way to unambiguously refer to any of them.
Namespaces are the way. They are identifiers that prefix the symbol name. The prefix is separated from the raw symbol name by a colon ':'.
Namespaces apply to global symbols only. By construction, there is only zero or one public symbol and zero or one private symbol to be seen from any given location in the program. However, global symbols do not necessarily harbor explicit namespaces. Global symbols are in the default namespace.
6.2 Including and naming a file: a first approach.
6.2.1 The include statement.
Auxiliary files are made available to the main file using the include statement:
include filename|(expression)
include filename|(expr) as namespace
Remember that a filename is either a string or a parenthesized expression whose run-time value is to be interpreted as a file name. The (generated) string is passed to the operating system as a filename as-is, and must conform to whatever syntax rules the OS enforces, like double quoting long file names with spaces in them.
The simplest form makes global symbols in filename visible to the other files. This may lead to symbol clashes, some of which are caused by files the coder did not write. See sections 6.3 and 7 below.
The second form allows using the prefix namespace: for global symbols in filename. Several filenames may share the same namespace.
When a file is included for the first time, its directly executable statements are executed. Subsequent include statements relative to this file do not trigger this action. A file is defined by its explicit path when supplied, or by a canonicalized path otherwise. As most OSes allow a path to be specified in many different ways, you may include the same file several times as if it were for the first time, duplicating created symbols, hopefully with different namespace prefixes.
However, this does not apply to links. These are aliases that are provided natively by some OSes, and as third party addons by others. As the user generally sets these aliases voluntarily to make a file appear where it is not, links resolve to the true file name with the supplied path.
include statements always declare namespaces: the default namespace is used even when none is supplied.
The same file may appear with various namespaces in the same physical file. This is not really a feature, but legacy behaviour. On the brighter side, various files may include the same file with different namespaces.
Using a string enclosed between parentheses causes the string to be considered as an expression, the evaluation of which is used as a filename.
6.2.2 Namespaces.
Namespaces are a way for a given file to refer to symbols in another given file. As a result, namespaces are known only in the file where they appear after an as keyword. So, there are only two sorts of symbol clashes:
- clashes between symbols sharing the same explicit namespace: the coder is responsible for them and must alter his/her own code to set things right;
- clashes between symbols without namespaces may originate from files the coder did not write. And (s)he included them in order not to rewrite them. Tools are provided for the coder to manage such conflicts between external libraries; see below.
6.3 The import, promote and demote statements.
Because of the somewhat undiscriminating nature of the include statement, which has symbols appearing in two namespaces when one was specified, and which acts in the same way upon all symbols in the included file, another construct is needed to get a more controllable behaviour. Changing the rules for include would most likely break too much Euphoria code.
6.3.1 The import statement.
The statement
import filename|(expression) as namespace
makes the symbols of filename appear in the namespace namespace. The symbols are not visible in the default namespace, contrary to what the include statement does.
A string immediately following import and enclosed in parentheses is an expression that must evaluate to a string. That string is then processed as a filename, just like it would for an include statement.
Thus, import (misc.oe) as msc will look for a record called misc, with a member named oe, or a sequence misc with a named element oe. If this can be found and holds a string, this string is the filename to be imported.
But import misc.oe as msc will look for a file called misc.oe and will make its global symbols visible in the namespace msc only.
The discussion about "the first time" a file is included applies to the import and include statements collectively.
6.3.2 The promote statement.
Because it is sometimes convenient or useful to use global symbols without using prefixes, it is possible to select symbols to be promoted to the default (unprefixed) namespace.
The supported syntaxes are:
- promote "{identifier}" from namespace: grants unprefixed access to the symbols explicitly specified in the list.
- promote identifier from namespace: identifier is assumed to be a sequence of strings, each of them being the name of a promoted symbol as above.
- promote _ from namespace: promotes all symbols from namespace.
- promote but list from namespace: promotes all symbols but those in the supplied exclusion list. The list may have any of the first two forms above.
Promoted symbols can then be accessed as if the file they come from had been included using the longer form of the include statement.
6.3.3 The demote statement.
Promoted symbols can be demoted, which means they still exist in their source namespace, but no longer in default. The following syntaxes are supported:
demote "{identifier}" [from namespace]
demote identifier [from namespace]
demote _ from namespace
demote but "{identifier}" from namespace
demote but identifier from namespace
These forms drop unprefixed access for the symbols listed, in a way almost symmetric to how promote adds them.
Why "almost"? Because the first two forms do not need to specify namespaces.
Indeed, there is normally one symbol of each name in the default namespace, and there is no ambiguity in the command given, hence no systematic need for the extra argument.
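The interplay of import, promote and demote can be pictured with a small Python model. This is a sketch under stated assumptions: the Namespaces class and its methods are invented here, names=None models the "_" form, the but argument models an exclusion list, and ambiguity is reported only when a symbol is looked up.

```python
class Namespaces:
    def __init__(self):
        self.spaces = {}    # namespace -> {symbol: value}
        self.default = {}   # unprefixed symbol -> namespaces it was promoted from

    def import_as(self, namespace, symbols):
        self.spaces.setdefault(namespace, {}).update(symbols)

    def promote(self, namespace, names=None, but=()):
        pool = self.spaces[namespace]
        for name in (pool if names is None else names):
            if name in pool and name not in but:
                owners = self.default.setdefault(name, [])
                if namespace not in owners:
                    owners.append(namespace)

    def demote(self, names=None, namespace=None, but=()):
        # namespace may be omitted, as in the first two demote forms.
        for name in list(self.default):
            if (names is None or name in names) and name not in but:
                owners = self.default[name]
                if namespace is None:
                    owners.clear()
                elif namespace in owners:
                    owners.remove(namespace)
                if not owners:
                    del self.default[name]

    def lookup(self, name):
        if ":" in name:                      # prefixed access always works
            namespace, raw = name.split(":", 1)
            return self.spaces[namespace][raw]
        owners = self.default.get(name, [])
        if len(owners) != 1:
            raise NameError(f"{name!r} is not uniquely visible without a prefix")
        return self.spaces[owners[0]][name]
```

A demoted symbol disappears from unprefixed lookup but stays reachable through its namespace prefix, as the text requires.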
7 Scopes.
7.1 Named scopes.
The construct
[global ]scope identifier
... some code ...
end scope
allows the enclosed code to be treated as if it came from an external file, whose name does not matter. Any code that may appear at some position in a source file may appear in a scope block at the same position.
A global symbol defined inside a named scope can be used:
- using the identifier: prefix, as if it came from another file;
- without prefix; this requires the use statement (see 7.3 below).
Whether a scope has a name or not, it basically remains a private area of code. To allow a named scope to be used, you explicitly enable it using the use name statement (see 7.3 below).
A scope declared as global can be considered as an external file, and is assumed to have been included using the include statement, with the scope name as a namespace. Alternately, a global scope is a scope that can be used from other physical files than the one it is in.
For this reason, namespaces are said to be associated to abstract files, and not only to physical files.
7.2 Unnamed scopes.
Unnamed scopes may appear in routines and, in this case, follow the same rules as code in routines.
This is specially handy when a small part of a routine needs a few variables not needed elsewhere in the routine. For clarity of code, putting them inside a scope block helps separate them from the ones most likely to be used.
A global symbol defined in an unnamed scope is visible from outside - more precisely, below - the scope in any scope it lies in, as you cannot use an unnamed scope. Thus, a global symbol defined in an unscoped, unnamed scope is just a filewide public variable (scoping and global sort of cancel out), while a global symbol defined in an unnamed scope inside another scope can be seen from the part of the outer scope extending from the end of the inner scope to its own end. However, it can't be seen from outside the outer scope, unless this scope is named and used.
7.3 The use statement.
The global symbols of a named scope can be accessed without any prefixing by issuing the use scopename statement.
The use statement is always local, which means that its effects stop at the end of execution of code in the scope or routine it appears in.
8 Routines
A routine is a piece of code which can be called and eventually returns. The first phrase means that control may be transferred to the first executable statement of the routine. This is not necessarily the first general-code statement, since variable initializations are executable statements. The second phrase means that, when the routine is finished with its work, it returns control to the statement following the calling statement.
On return, a routine may provide a value, be it an atom or not. If the routine returns anything, the routine call is evaluated to the returned value.
There are several keywords to define routines, because they have different histories and different roles to play.
8.1 Defining a routine.
The definition of a routine involves:
- optional attributes, which may be check, global or forward;
- a routine type keyword, chosen in the following list:
  - routine
  - coroutine
  - procedure
  - function
  - type
  - reftype
  - handler
- an identifier, which must not clash with a variable name;
- a pair of parentheses enclosing a possibly empty list of formal parameters.
Section 8.3 below describes formal parameters of routines.
8.1.1 Routine types.
routine is the generic word designating a piece of code with its own variables that can be called and must return (unless it terminates some process).
A coroutine has the same behaviour as a routine as far as return values are concerned. However, the statement that is reached on calling a coroutine is either the first one as usual, or the statement following the last yield taken in that piece of code, if the coroutine was last left through that yield rather than through a conventional return.
A procedure is a routine which does not return any value.
A function is a routine which must return a value.
A type is a special sort of function. It must return a boolean, and takes exactly one argument. Specifying a type routine defines a user-defined type with the same name.
A reftype is a special sort of function. It must return a boolean, and takes exactly two arguments: an integer, which is the id of the variable to be assigned, and a reference to the value to be assigned to it, so that the function can modify the value to be assigned. Specifying a reftype routine defines a user-defined type of the same name.
A handler is a routine designed to handle events. These events are triggered at run-time, and handlers are not primarily meant to be called explicitly, even though they can as any routine. The argument list of a handler is described in 11.2 below.
Additionally, when a variable is assigned, the (ref)type function associated to it is executed prior to the assignment, and an exception is raised if it returns False (see Chapter 1, "Types", for details). For this reason, (ref)types may be considered as a hybrid between functions and handlers.
8.1.2 Forward declaration.
Sometimes, it makes the code clearer to use a routine even though it was not defined yet. In order to do this, you can use the forward attribute, followed by the routine definition.
When the time has come to code the routine statements, just issue the statement routine_type routine_name, and go ahead with the statements in the routine. The full definition of the routine is already known, so that this shorter form is enough. You still can give the full definition again, but a SyntaxError error will occur if there is a mismatch between the two definitions.
Obviously, if a routine is declared forward and no flesh is added to this bone, a SyntaxError exception will occur at the end of the parsing of the source file.
8.1.3 Calling a routine.
An explicit routine call is made of:
- the routine name;
- a pair of parentheses enclosing a list of values, called arguments of the call.
The list of values should conform to the list of types specified in the routine definition. For instance, if foo was defined as routine foo(integer i,string s), then a call to foo must be like foo(expr1,expr2). expr1 is checked to be an integer, and expr2 is checked to be a string. If one of the checks fails, or if there are not exactly two arguments, an exception will occur.
If the routine is to return a value, and if this value is to be used, the routine call (routine name, parentheses and everything in between) is replaced by the returned value.
You may ignore the value returned by a function or routine by desequencing it to the empty list, as in #(_)#=foo(i,s). Calling a function as a procedure causes a SyntaxError error, as well as calling a routine as a procedure when it returns a value.
8.2 The return and resume statements.
8.2.1 Returning from a routine.
To signal that a routine must terminate and return control to the statement logically following the routine call, use the return statement. return by itself just terminates the routine; return expr does so and returns the value of expr.
The concept of "statement logically following the call" is as follows:
- if no value is returned, the statement logically following the call is the statement physically following the call;
- if a value is returned and the routine call is not part of a compound expression, the statement following the call is the assignment of the returned value;
- if the call belongs to a compound expression, the statement logically following the call is the next step in evaluating this expression.
If a routine does not have an explicit return statement, return is assumed right before the end mark of the routine.
8.2.2 The yield statement.
You may return from a coroutine using another statement than return or extended_return: yield. Returning using the yield statement records the location of the statement. Next time the coroutine is called, the statement following the yield just taken will be executed as first statement of the coroutine.
The state of the coroutine, which means the set of values taken by its private variables, is saved as well and restored on the next call. This allows an easy implementation of threads.
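Python generators offer a close analogy to this yield behaviour and can illustrate it. The sketch below is only an analogy: running_total, worker and round_robin are made-up names, and a Python generator is resumed with next() rather than by an ordinary call.

```python
def running_total(limit):
    total = 0                      # private variable, preserved across yields
    for step in range(1, limit + 1):
        total += step
        yield total                # leave here; the next call resumes after this line


def worker(label, count):
    for i in range(count):
        yield f"{label}{i}"


def round_robin(*tasks):
    """Interleave several coroutines, a very small cooperative scheduler."""
    log = []
    pending = list(tasks)
    while pending:
        for task in list(pending):
            try:
                log.append(next(task))
            except StopIteration:  # the coroutine returned the conventional way
                pending.remove(task)
    return log
```

Each call to next() continues right after the yield last taken, with total or i still holding its previous value, which is what makes the thread-like interleaving in round_robin possible.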
8.2.3 Resuming execution.
The resume statement asks the routine to terminate and reexecute the statement which triggered the routine. For this reason, this statement is meant for exception handlers and, normally, should not be executed when the routine is called explicitly. There is no point in returning a value here, so that the mention of a value to return is not supported.
8.3 Formal parameters of a routine.
Formal parameters are characterised by three properties:
- a passing mode
- a type
- a name
8.3.1 Passing mode.
An argument is either a single variable name, a constant or a more complex expression like "x+1". "Variable name" extends to whatever might have a variable id, like a named sequence element or record member.
An argument which is not a variable name can only be passed by value.
However, when an argument is a variable name, two actions can be performed. Proceeding as above is the first option; the alternative is to let the variable be temporarily aliased by the associated formal parameter in the routine body.
The first method is called "pass by value", and is the only method explicitly used by Euphoria. It is the default passing mode in OpenEuphoria.
The second method is called "pass by reference", and must be explicitly enabled in the formal parameter specification.
To allow passing by reference of a formal parameter, prefix it with the update keyword in the routine definition. When calling a routine, if the n-th argument is an expression other than a variable name and is supposed to be passed by reference, it will be passed by value instead. You can use the keyword byval just before the expression to emphasize that the effect of the update keyword is temporarily suppressed.
The with byref directive controls the way OpenEuphoria checks passing mode. When calling a routine while with byref is in force (this is the default behaviour), if the n-th argument is a variable name and is supposed to be passed by reference, it must be preceded by one of the keywords byref or byval; otherwise an exception will occur. The variable will be passed by reference if byref is used, and passed by value if byval is used. If the directive is turned off, neither keyword is mandatory, as the routine definition holds all the relevant information.
Remember that, when a variable is passed by reference, any modification made by the called routine to the formal parameter to which this variable is mapped by the routine call is reflected to this variable. When passed by value, no modification is reflected, since the routine operates only on a local copy of the variable.
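The difference between the two passing modes can be modelled outside OpenEuphoria. The sketch below is Python and purely illustrative: the Ref class and both routine names are hypothetical, and Ref merely stands in for the aliasing that an update parameter would provide.

```python
class Ref:
    """Hypothetical one-slot box standing in for an aliased variable."""

    def __init__(self, value):
        self.value = value


def lower_by_value(text):
    # The routine works on its own local copy; the caller's variable is untouched.
    text = text.lower()
    return len(text)


def lower_by_reference(result):
    # 'result' aliases the caller's variable, as an update parameter would.
    result.value = result.value.lower()
    return len(result.value)


seq = "heLlo, woRld!"
lower_by_value(seq)
after_by_value = seq                 # unchanged by the call

box = Ref(seq)
lower_by_reference(box)
after_by_reference = box.value       # modified through the alias
print(after_by_value, "/", after_by_reference)
```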
8.3.2 Parameter types.
The type of a parameter must be explicitly stated; however, preprocessors may help reduce the typing by allowing type aliasing or completion.
Specifying a formal parameter as array[(size)] or sequence indicates that no specific type is expected for the elements of the nonatom passed. You may give a size to an array or just leave it out.
8.3.3 Example.
without byref --to simplify things
forward function foo(string s,update string result,integer i)
...
seq="heLlo, woRld!"
...
x=foo(seq,append(seq,x),3)
--the second parameter will be passed by value regardless of the update keyword,
--as nothing can be mapped to this compound expression.
x=foo(length(s),s,j)
--ERROR: length(s) is an integer, and the first parameter is a string.
x=foo(seq,seq,0)
--this one is correct
...
routine foo
--nothing more to say, here comes the beef.
...
result=lower(seq)
...
--after the correct call to foo, seq is "hello, world!",
--since result, the second parameter of foo, is passed by reference and
--seq is a variable name passed to foo as second argument, so that result
--aliases it.
Variables declared inside a routine are private and shadow any existing symbol with the same name. Formal parameters of routines have the same behaviour.
A routine can access by name any public or global variable in scope at the time of the call, as well as its own private variables. A private variable cannot be accessed by name outside of its routine.
On return from a routine, all its private variables cease to exist, except those declared as static.
Explicit invocation of a routine takes the following form: the routine name, a left parenthesis, the possibly empty, comma separated list of arguments, and finally a right parenthesis.
For instance, i=find("myself",someSequence) is a routine call with two arguments. The first argument is "myself" and the second one is someSequence. It is a function-like call, since a value is retrieved from the called routine on return.
The arguments must match the formal parameters in number and type. Failure to do so will raise an "ArgError" exception.
You can use a sequence to represent several consecutive arguments in a routine call. To do so, the sequence must be prefixed by the # desequencing sign, and be enclosed in parentheses to avoid any risk of being mistaken for a hex number.
Thus, if you want to issue the call foo(1,2,3,0), and you have a nonatom fooArgs at hand with the value {1,2,3}, you can issue foo(#(fooArgs),0) with precisely the same effect, which is to call the foo routine with the four arguments 1, 2, 3 and 0.
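Readers familiar with other languages may recognise this as argument unpacking. As a loose analogy only, the Python sketch below uses the * prefix where OpenEuphoria would use #( ); the four-parameter foo defined here is a hypothetical stand-in.

```python
def foo(a, b, c, d):
    """Hypothetical four-argument routine used only for this illustration."""
    return [a, b, c, d]


foo_args = [1, 2, 3]

direct = foo(1, 2, 3, 0)
spread = foo(*foo_args, 0)    # *foo_args plays the role of #(fooArgs)
print(direct == spread)
```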
Just like variables, routines have an index called routine_id. Five functions are provided to manage dynamic calls:
- routine_id(expr) returns the id of the routine designated by the value of expr. If no such routine exists, -1 is returned.
- get_name(expr) returns the name of the routine whose id expr evaluates to. If no such routine exists, the empty string "" is returned.
- call_routine(expr,list) calls the routine whose id is given by the value of expr as a routine, with the argument list list. In particular, if the routine called does not take arguments, list should be {}.
- call_proc(expr,list) acts as call_routine, but applies only to procedures. Provided for compatibility only.
- call_func(expr,list) acts as call_routine, but applies only to function-like calls. Provided for compatibility only.
Built-in routines, even though they are defined in no file, have routine id's which can be retrieved as for any user-defined routine.
Global routines may be accessed by name when the abstract file they are defined in is included by other files. Any routine can be accessed through its routine id.
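The relationship between the first three functions can be modelled with a small registry. This is a Python sketch under stated assumptions, not the OpenEuphoria implementation: ids are taken to be positions in a list, the register helper is invented, and the shadowing of older routines by newer ones is not modelled.

```python
# Hypothetical registry: a routine id is simply a position in this list.
_routines = []


def register(func):
    """Record a routine and return it unchanged (usable as a decorator)."""
    _routines.append(func)
    return func


def routine_id(name):
    """Return the id of the routine with this name, or -1 if there is none."""
    for index, func in enumerate(_routines):
        if func.__name__ == name:
            return index
    return -1


def get_name(rid):
    """Return the name for an id, or the empty string if the id is unknown."""
    if 0 <= rid < len(_routines):
        return _routines[rid].__name__
    return ""


def call_routine(rid, args):
    """Call the routine with the given id, spreading the argument list."""
    return _routines[rid](*args)


@register
def add(a, b):
    return a + b


@register
def greet():
    return "hello"


print(call_routine(routine_id("add"), [2, 3]))
print(call_routine(routine_id("greet"), []))   # no arguments: empty list
```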
OpenEuphoria provides quite a few built-in routines, like the length() function (counts the number of elements), the integer() type (returns True on integers and False otherwise) and so on. They are treated as global symbols, and also have a namespace of their own, called builtin.
Defining a routine with the same name as an existing one in a given namespace (including default: or builtin:) generates a warning and shadows the preexisting routine with the newer one.
The @ construct used for variables (see 5.1) is also available for routines, with a few restrictions however because some of them don't always make sense. The available metadata for routines are:
- name: the name of the routine.
- type: a small integer representing the keyword used to define the routine. The recommended mapping is:
  - routine
  - coroutine
  - function
  - procedure
  - type
  - reftype
  - handler
- id: the routine id of the visible routine with the given name.
- format: meaningful for types and reftypes only. Default format to be used to print variables of this type. The default value for this metadata is "". If so, or if the routine is not a (ref)type, the value is ignored.
- types: a possibly empty array of integers, which are the routine_ids of the types of the formal parameters, in the order that they were enumerated at definition time.
- scope: sGlobal or sPublic, as for variables.
Unless stated otherwise, the meanings of the same metadata for routines and variables are very similar. The get_meta function is also available for routines (see 5.1) exactly as for variables, except that it returns a record of the reserved type SystemRtMeta, with names and positions as in the list above.
The call_chain() function returns a sequence representing the calling chain that goes from the first routine call in the stack to the current statement; it does not include the call to call_chain() itself.
Each element in the returned sequence is either a string, which is a routine name or a file name, or a pair of strings. Each isolated string, or first string of a pair, represents the file scope or routine name the call took place in. The strings that appear second in a pair are labels attached to the calling statement, if any.
The extended_return statement makes it possible to return from several layers of calls at once.
The statement comes in two flavours: extended_return(levels) and extended_return(levels,value). If any value is returned, it is the second argument of the second form; the returned value is not inspected by any code in the routines that are being skipped in this way. For both forms, levels is the number of successive return statements to perform. Thus, extended_return() with a first argument of 1 is equivalent to the classical return, except that parentheses are mandatory.
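One way to picture a multi-level return is as an unwinding signal that each skipped routine passes along untouched. The Python sketch below models this with an exception; the _Unwind class and the routine decorator are inventions of this example and say nothing about how OpenEuphoria would implement the statement.

```python
class _Unwind(Exception):
    """Carries the number of levels still to return from, plus the value."""

    def __init__(self, levels, value):
        super().__init__()
        self.levels = levels
        self.value = value


def extended_return(levels, value=None):
    """Start returning from `levels` nested routine calls at once."""
    raise _Unwind(levels, value)


def routine(func):
    """Decorator marking a call level that extended_return can skip."""
    def wrapper(*args):
        try:
            return func(*args)
        except _Unwind as unwind:
            if unwind.levels <= 1:
                return unwind.value      # last level: behaves like return
            unwind.levels -= 1
            raise                        # keep unwinding into the caller
    return wrapper


@routine
def inner():
    extended_return(2, "from inner")
    return "never reached"


@routine
def middle():
    inner()
    return "middle finished normally"    # skipped by the two-level return


@routine
def outer():
    return "outer got: " + str(middle())


print(outer())
```

The code in middle never sees the value, which matches the statement that the returned value is not inspected by the routines being skipped.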
Code blocks are code between a blocktype statement and the matching end blocktype statement.
Blocks are nested, which means that the order of the blocktypes and the order of the end blocktypes must be exactly the reverse of each other. Failure to do so causes an irrecoverable syntax error.
The label identifier statement may appear at any place in the code, tagging the following statement with the name identifier. However, the uses of this tag depend on the nature of the tagged statement. A label statement barely qualifies as a statement, as it is never executed.
Any label statement can be the target of a goto statement (see 9.8 below). Thus, the program execution point may be transferred to the statement following label.
Any goto statement must have an accompanying label statement. This way, the target may be aware of whether it was reached by direct branching using goto or by a more normal kind of execution flow. identifier may then be retrieved by the come_from() function.
When tagging a code block header, like a for statement, the identifier can be used in instructions which control code block execution (see 9.4 and 9.5 below).
When tagging a statement containing a routine call, the identifier appears in the data returned by call_chain() when invoked at any point downstream in the call chain.
Remember that an if block may have a substructure:
if cond then
general-code
[elsif cond then general-code]
[else general-code]
end if
This will execute some general-code depending on the values of the cond expressions, according to the following rules:
- Each cond is evaluated until one of them is true or all of them are false.
- If one of the cond evaluates to True, the next general-code statements are executed until the next elsif, else or end if is found. Execution then resumes right after the end if statement closing the if block.
- Otherwise, the code following the else statement is executed if there is such a statement. Execution then resumes right after the closing end if statement.
When several courses of action might be taken according to the value of some expression, you can always stack a few elsif statements inside an if block. However, it may not be the clearest way to code this sort of situation, and this is why an alternative construct is provided. Also, you may want to take several branches in succession, which the if statement does not allow.
The structure of a select block is as follows:
select expr
case statement
[otherwise general code]
end select
and a case statement takes one of the following forms, each followed by general code:
- case expr
- case rel_op expr
- case expr thru expr
- case condition
The expression following the keyword select is evaluated, and this value is called the selector of the block. Decisions will be made according to this value. Each branch of the decision tree is represented by a case statement; an otherwise branch may be there as well.
The keyword case may be followed by four different types of items:
- a single expression, whose value is computed and matched against the selector. The general code that follows is executed if and only if the two values were equal.
- a relational operator, followed by an expression. As above, the expression is evaluated. The general code executes if and only if the condition selector rel_op (value of expr) is true. A relational operator of '=' may be omitted, since it leads to the case above.
- two expressions separated by the keyword thru. The two expressions are evaluated; the general code executes if and only if the selector is inside the closed interval those two values bound.
- a condition involving the symbol "_": the condition is evaluated as for an if statement, except that the "_" symbol stands for the selector. The general code executes if and only if the condition evaluates to True.
Each case statement starts with (a simplified form of) a conditional clause. The code following a case statement is executed whenever the corresponding condition is true. After that, if the block was not exited, the next case statement is inspected.
The process goes on until one of three mutually exclusive situations occurs:
- the select block is exited using the break statement. No more case statements are processed, and execution resumes right after the end select statement.
- the end select statement is reached: no action is taken, and the block is exited.
- the otherwise statement is reached: see section below.
The otherwise statement is optional, and is allowed only inside a select block as its last sub-block. It may appear at most once in a block.
If it is reached, and one of the branches of the select block was taken, the block is exited; otherwise, execution continues past the otherwise statement.
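The rules for case, break and otherwise can be condensed into a short model. The following Python sketch is an interpretation of the text above, not OpenEuphoria code: run_select and the three-element case tuples are hypothetical, and each branch body is reduced to a label so that the order of execution is visible.

```python
def run_select(selector, cases, otherwise=None):
    """Model of a select block; returns the labels of the branches executed.

    cases     -- list of (condition, label, leaves) tuples: condition tests the
                 selector, leaves is True when the branch ends with break.
    otherwise -- optional label for the otherwise branch.
    """
    executed = []
    for condition, label, leaves in cases:
        if condition(selector):
            executed.append(label)       # the branch's general code runs
            if leaves:                   # break: exit the select block
                return executed
    # The otherwise statement is reached: it runs only if no branch was taken.
    if otherwise is not None and not executed:
        executed.append(otherwise)
    return executed


CASES = [
    (lambda s: s == 5, "equals 5", False),             # case 5
    (lambda s: s > 3, "above 3", True),                # case > 3, ending with break
    (lambda s: 1 <= s <= 10, "in 1 thru 10", False),   # case 1 thru 10
]

print(run_select(5, CASES, "otherwise"))
print(run_select(2, CASES, "otherwise"))
print(run_select(0, CASES, "otherwise"))
```

With a selector of 5, two branches run in succession before break leaves the block, which is the behaviour an if block cannot express.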
Allowed only inside a case branch, it causes that branch to be exited and the next case condition to be tested. If there is none left, the select block is exited.
The complete syntax is as follows:
for index=start value to end value [by increment] do
general code
end for
When the for statement is reached from outside the block, start value, end value and increment are computed. If the by clause is not present, increment is set to 1. They all must evaluate to atomic values. These values are not computed again during the subsequent loop iterations.
The loop index variable index must not have been declared, and is assigned start value. It cannot be modified.
If (start value-end value)*increment is greater than zero, no iteration is performed and the loop is exited, as there is no way for the index variable to get closer to end value. Otherwise, the first iteration starts.
When the for statement is reached from inside the loop, the index is tested: if it is not between the start and end values, the loop is exited without any further iteration and execution resumes right after the end for statement; otherwise, a new iteration starts. On exit, the loop index variable remains available until the next for statement using the same identifier as its index variable. In Euphoria, by contrast, the loop index vanishes outside the for loop it was defined in.
When the end for statement is reached, the index is incremented by the increment and control is transferred to the for statement.
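The iteration rules above can be checked with a small model. This Python sketch lists the values the index would take; for_indices is a hypothetical helper, and rejecting a zero increment is an assumption of the sketch, since the text does not say what happens in that case.

```python
def for_indices(start, end, increment=1):
    """Return the index values a for loop would take under the rules above."""
    if increment == 0:
        # Assumption of this sketch: the text does not cover a zero increment.
        raise ValueError("zero increment is not modelled")
    values = []
    # No iteration when the index could never get closer to the end value.
    if (start - end) * increment > 0:
        return values
    low, high = min(start, end), max(start, end)
    index = start
    while low <= index <= high:      # index between the start and end values
        values.append(index)
        index += increment           # what reaching "end for" does
    return values


print(for_indices(1, 5))
print(for_indices(5, 1, -2))
print(for_indices(5, 1))
```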
while cond do general-code end while
Executes an iteration of the loop if cond is true. Otherwise transfers control right after the end while statement.
The end while statement just causes the while statement to be executed again.
The complete syntax is as follows:
wfor identifier=start value to end value [by increment] do
general code
end wfor
This loop is a hybrid between a for and a while loop, hence the wfor name.
If the by clause is not present, increment is set to 1. The start value, end value and increment are computed whenever the wfor statement is executed, and must always evaluate to atomic values.
The loop index variable identifier must have been declared. It is an ordinary variable which may be assigned inside the loop.
If the index is not between the start and end values, the loop is exited without any (further) iteration. Execution resumes right after the end wfor statement. Otherwise, a new iteration starts.
When the end wfor statement is reached, the index is incremented by the increment and control is transferred to the wfor statement.
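The practical difference from a for loop is that the bounds are recomputed on every pass and the index may be assigned inside the body. The Python sketch below imitates this with a while loop; remove_all is a hypothetical example, and Python's zero-based indexing is used rather than OpenEuphoria's conventions.

```python
def remove_all(items, unwanted):
    """Delete every occurrence of `unwanted` while scanning a shrinking list."""
    items = list(items)      # work on a copy of the caller's list
    i = 0
    # The end value len(items) - 1 is recomputed on every pass, as in wfor.
    while 0 <= i <= len(items) - 1:
        if items[i] == unwanted:
            del items[i]     # the end value shrinks
            i -= 1           # the index is an ordinary variable: step back
        i += 1               # what reaching "end wfor" does
    return items


print(remove_all([4, 7, 7, 9, 7], 7))
```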
Exiting a block means that the next executed statement is the one following the end blocktype statement which ends the block.
A code block is said to be "active" relative to a statement if it contains that statement.
The exit statement exits a loop, the exif statement exits an if block, and the break statement exits a select block.
They all take an optional argument. Without it, the current relevant block is exited. Otherwise, the specified block (see below) is exited.
The optional argument of an exiting keyword is either a number or an identifier. The phrase "relevant block" translates to "loop block" when referring to an exit statement, an if block when referring to an exif statement and a select block when a break statement is involved.
If the argument is an identifier, it must be a label tagging an active relevant block. Labels are set using the label statement (see 9.1 above). Failing this consistency criterion raises an exception. Otherwise, the block tagged by this label is exited.
If the argument is an integer greater than zero, this number is the number of relevant blocks nesting the current one that must be exited. Thus exit 1 means "exit the active loop above the current one", exit 2 exits the loop above the one above the current one, and so on.
If the number is negative, then the active relevant blocks above the current one are counted backwards from the top to determine the block to be exited. Thus, exit -1 means "exit the topmost active loop block", exit -2 means "exit the active loop just below the topmost one", and so on.
An argument of 0 is ignored, as it would only emphasize that the current relevant block is to be exited.
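The way an exiting keyword selects its target can be summarised as a lookup in the list of active relevant blocks. The Python sketch below is one reading of the rules above; resolve_exit is hypothetical, and raising an error when the number exceeds the nesting depth is an assumption, as the text only specifies an exception for bad labels.

```python
def resolve_exit(active_blocks, arg=None):
    """Return the label of the block an exiting keyword would leave.

    active_blocks -- labels of the active relevant blocks, outermost first,
                     so the current block is the last element.
    arg           -- None, an integer, or a label string.
    """
    if not active_blocks:
        raise ValueError("no active relevant block")
    if arg is None or arg == 0:
        index = len(active_blocks) - 1           # the current block
    elif isinstance(arg, str):
        if arg not in active_blocks:
            raise ValueError("label does not tag an active relevant block")
        index = active_blocks.index(arg)
    elif arg > 0:
        index = len(active_blocks) - 1 - arg     # count upwards from current
    else:
        index = -arg - 1                         # count downwards from topmost
    if not 0 <= index < len(active_blocks):
        raise ValueError("not enough nested blocks")   # assumption of this sketch
    return active_blocks[index]


loops = ["outer", "middle", "inner"]
print(resolve_exit(loops), resolve_exit(loops, 1), resolve_exit(loops, -1))
```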
A loop iteration can be stopped at any point during its execution using the keywords next or retry. These keywords accept the same kinds of optional argument as exit.
The next statement causes a new iteration of the loop to occur. This means that control is transferred to the opening statement of the loop, causing index update in for or wfor loops, and condition evaluation in a while loop.
The retry statement causes the current iteration of the loop to start again. This means that control is transferred to the first statement inside the loop block. Thus, the index of a for or wfor loop is not updated, and the condition of a while loop is not evaluated.
Described in chapter 7, they are also blocks and follow the general rule about nesting: a block can't end outside a block inside which it starts.
On top of all constructs above, which allow for an orderly yet easily managed execution flow, OpenEuphoria provides another tool to perform tasks the above would not allow to perform easily: the goto statement.
Computer science experts have fought over the value of goto as a statement in a high level language. Using it too much certainly leads to a hard to follow program execution flow, which makes maintenance and upgrading all the harder. The goto statement is probably more useful in rapid development stages than in production code, even though sparse and relevant use can really optimize a few things.
The statement goto identifier causes program execution to resume at the statement that immediately follows a matching label statement in the current routine or file scope.
A goto statement must be preceded by a label statement. This will allow the target to be aware that control was transferred to it using a goto statement. Failing to label a goto statement, or branching to an unavailable label, causes a runtime error.
The come_from() function takes no argument and returns the label attached to the last executed goto statement as a string. Most of the time, it is important for some statement to know it was reached by a goto rather than through a more conservative flow control command like if or while.
The come_back statement reverts the effect of a goto statement by transferring the execution point to the statement following the last goto taken.
This function clears the internal variable holding the label of the last goto taken, so that a subsequent come_back does not have unintended effects.
For certain special purposes, it may be legitimate to perform a far jump to a label in another namespace. This is done by goto_far(namespace,label). Just like the less far-reaching goto, it must bear a label. The equivalents of the other three statements are come_from_far(), come_back_far and goto_clear_far(). Their descriptions are identical to those in 9.8.2 to 9.8.4 above.
A run-time debugger makes it extremely easy to debug a program, much easier at least than scattering a few print() statements and having to guess what is going wrong in program flow, variable assignments and other issues.
The integrated debugger is enabled by the with trace statement, and completely turned off by the without trace statement. This default behaviour saves execution time.
If the debugger is enabled, you start it by the trace(1) statement, and turn it off from the running program by the trace(0) statement.
A command/status line will also be provided, as the debugger may process user input (see 10.2 below) and display some information to the user.
The main debugger screen shows about 15 lines of code in 25 line console displays, more if console displays more lines, highlighting the one to be executed. This line will remain about the middle of the screen most of the time, so that some code before and after it can be seen always. It will be called the active line. Another line may be highlighted in some other way, and will be called the spot line.
Another part of the screen is reserved to show the values of most recently accessed variables. These values are updated as source statements are executed.
The debugger must be implemented in such a way that it will not trace itself, nor trace events it may (cause to) trigger.
The following actions should be requested using one-key keyboard shortcuts:
- toggle display between debugger screen and running application (recommended: F2);
- toggle display between color and monochrome display mode (recommended: F3);
- stop tracing and go (recommended: q);
- quit program and debugger altogether (recommended: Q);
- log executed statements to the previously defined trace file (asks for name if none) (recommended: L);
- stop logging (recommended: l);
- prompt for a trace file name (recommended: f);
- see more code upstream (recommended: PageUp for one page, UpArrow for one line);
- see more code downstream (recommended: PageDn for one page, DnArrow for one line);
- set the spot line as next statement to execute (recommended: F5);
- restore display of active line (recommended: End);
- execute active statement (recommended: Enter);
- undo the previous statement execution. The number of statements thus unexecuted may be limited (recommended: Backspace);
- toggle breakpoint at spot line (recommended: F8);
- show more of a large variable value in a text box (recommended: s);
- revert to normal display, closing the variable display box (recommended: S);
- reinitialize program and restart debugging from scratch (recommended: Home).
Scrolling through code using the mouse buttons, movements or wheel actions is to be provided.
Rather than immediate actions, the following are commands aimed at inducing specific behaviour from the debugger, or to set some trace scheduling.
The b command allows you to enter a conditional expression. This expression must be a valid OpenEuphoria condition. This condition sets up a dynamic breakpoint, which is triggered any time the condition is true. The expression may use any variable in scope at the time it is defined. Whenever one of these variables gets out of scope (for instance, returning from a routine), the dynamic breakpoint is disabled.
This breakpoint is independent from the static breakpoint that F8 toggles on and off.
The ? command allows you to enter a valid OpenEuphoria expression. This expression will be treated as a most recently modified variable and displayed as such.
The s command will prompt you to enter an OpenEuphoria expression. If this expression is not among the displayed variables, it is added as the ? command would add it. Moreover, a text box will open up and display a good deal of the expression value, considerably more than ? would have allowed. The S command (10.1) closes the box.
The status line referred to in 10.1.1 will display information about the state of the static and dynamic breakpoint, as well as the indication of which one was last triggered.
Exceptions are situations which most likely arise from an error. Exceptions will cause default or user-defined routines (handlers, actually) to be executed when available. This way, the program knows that something possibly went wrong and may take corrective action as needed to avoid or soften the crash. An exception is generated by hardware, which signals that something is amiss (no memory at this address, invalid floating point number, stack overflow, whatever), and software may take action to recover, or stop processing in the most graceful way possible.
Events are actions taken by the machine code being executed. Trapping them, also using handlers, makes it possible to be informed of what is going on. Such hooks are of obvious use for debugging or profiling purposes, but they may serve many more useful programming needs as well. Events are not really triggered; they are reports that some action, like calling a routine or reading a variable, is being taken. This signal may not be listened to at all, or only in a limited number of cases.
Because the same mechanism is used in both contexts, the term of event will be used to refer to both exceptions and program events indifferently. The underlying physical architecture of the machine on which OE is running, or the design of its operating system, may change which exceptions or events are processed in software only or through hardware. As OE strives to be fully cross-platform, these details are supposed to be hidden from the user. In the event of exceptions to this principle, they will have to be fully documented in release documents.
A last note: code generated by most programming languages does trigger a lot of events. Most of the time, there is no way the software can hook the events and steer away from the default action, which may not be the most sensible or efficient in a given case. In line with the principle of openness, OpenEuphoria aims to provide total control, including in those situations that might go awry fast if the right move is not taken at the right moment.
This is done using the statement set_handler(id, event). id must resolve to the routine id of a handler. event must resolve to a string representing an event name.
get_handler(expr), where expr resolves to an event name, returns the id of the handler for this event.
A handler is called with five parameters:
The following table lists the events and exceptions that call a handler, the parameter they pass as a last argument and the default handler action.
Name | Last parameter | Default action |
---|---|---|
AfterAssign | {} | does nothing |
AfterRead | value | returns the value |
AfterReturn | {value}, or {} if none | does nothing |
AfterWarning | {warning text,warning code} | does nothing |
ArgError | {argument #,value} | aborts |
BeforeAssign | value | calls type checking code, possibly issuing TypeError |
BeforeCall | argument list | checks types, possibly issuing ArgError |
BeforeExecute | statement text | does nothing |
BeforeIndex | {} | converts index to positive and checks if it is allowable, possibly issuing IndexBounds |
BeforeRead | {} | does nothing |
BeforeWarning | {warning text,warning code} | displays the warning |
ExternalOverflow | {max size,address} | aborts |
IndexBounds | value | aborts |
MathIndeterminacy | argument | aborts |
RaisedError | error message | prints the line # and the supplied error message, then aborts |
RuntimeError | {statement text,error code} | aborts |
StackOverflow | stack size | aborts |
SyntaxError | statement text | aborts |
TypeError | value | aborts |
UnknownToken | statement text | aborts |
ZeroDivide | {} | aborts |
Here is a more detailed account:
By default, events are all disabled for maximal performance. The with events directive makes it possible to enable or disable any event or event pair at will.
More usage notes:
This procedure takes a string as its argument, and passes it, as well as the usual four other arguments, to the RaisedError handler. Both second and third arguments are zero.
It may be desirable, when a resume or return instruction is executed, to execute a dynamically generated statement after leaving the handler, but before the standard action is taken. The argument for resume_execute and return_execute is an expression which is fed to execute at the appropriate time.
Example: assume that a string is being scanned, and its length may vary in the process. A while or wfor loop may do the trick, except that, after almost any statement modifying the scanning index or the length of the string, a check must be performed to avoid index out of bounds exception.
A clean solution then is to instruct the relevant handler to quietly exit the loop whenever this condition happens.
Thus, one may code:
IOBhandler=get_handler("IndexBounds")
handler IndexBounds(integer event,integer varid,integer index,integer lineno,update object vAlue)
if varid=scanned@id then return_execute("exit")
--when scanned is subscripted with an out-of-bounds index, just exit
else call_proc(IOBhandler,{event,varid,index,lineno,byref vAlue})
--otherwise, chain to previous handler.
end if
end handler
--now the loop
i=1
while i<=length(scanned) do
...
--code that no longer needs repeated checks like
--"if i>length(scanned) then exit end if"
...
end while
set_handler("IndexBounds",IOBhandler) --restore previous handler
The code inside the loop got rid of repeated checks and is clearer and leaner as a result. There is hardly any performance loss, since the handler is invoked only on an error condition. As the index checks will be performed anyway, repeating them in code is sheer waste of CPU cycles, as they are mostly useless.
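The save, replace, chain and restore pattern used in this example is independent of the language. The following Python sketch models it with a dictionary of handlers; the registry, the info dictionary and the argument order of set_handler are choices of this sketch only, and the "abort" string merely stands for the default action listed in the table.

```python
# Hypothetical registry; "abort" stands for the default IndexBounds action.
_handlers = {"IndexBounds": lambda info: "abort"}


def get_handler(event):
    """Return the handler currently installed for an event name."""
    return _handlers[event]


def set_handler(event, handler):
    """Install a handler for an event name."""
    _handlers[event] = handler


def raise_event(event, info):
    """Dispatch an event to whichever handler is installed."""
    return _handlers[event](info)


previous = get_handler("IndexBounds")        # remember the old handler


def quiet_index_bounds(info):
    """Exit quietly for the watched variable, otherwise chain to the old handler."""
    if info["variable"] == "scanned":
        return "exit loop"
    return previous(info)


set_handler("IndexBounds", quiet_index_bounds)
watched = raise_event("IndexBounds", {"variable": "scanned", "index": 12})
other = raise_event("IndexBounds", {"variable": "table", "index": 3})

set_handler("IndexBounds", previous)         # restore the previous handler
restored = raise_event("IndexBounds", {"variable": "scanned", "index": 12})
print(watched, other, restored)
```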
When an error causes an OpenEuphoria program to abort, it generates a file holding the values of all variables, as well as an error message stating the error, where it happened and, whenever possible, a traceback of all calls that led to the fatal error.
Additionally, a message is sent to stderr. The default message is made of the header and traceback part in the error file.
By default, the error file generated is called "oe.err". You can specify another relative or absolute file name using the call crash_file(newFileName).
You can replace the default message with some of your own by calling crash_message(newMessage).
You may also want to apply some processing to the crash message, whether it is the default one or not. You can do so using crash_process(id), where id is the routine_id of a routine that takes a string (the current crash message) as an argument passed by reference and processes it before display. The default action is to return immediately.
OpenEuphoria allows you to use strings to hold expressions or code to be executed.
The eval function takes a string as argument. This string must be a valid expression: eval evaluates it and returns its value. Thus:
s1="3+"
s2="length(s)"
x=eval(s1 & s2)
will assign 3+length(s) to the variable x, unless s is not a declared nonatom, in which case an error occurs.
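Several languages offer a comparable facility. As an analogy only, the Python sketch below builds the same kind of expression from two strings and evaluates it with Python's own eval; the evaluate wrapper is hypothetical, and len() is Python's spelling of length().

```python
def evaluate(text, variables):
    """Evaluate an expression string against an explicit set of variables."""
    # A copy is passed so that eval cannot modify the caller's dictionary.
    return eval(text, dict(variables))


s = [10, 20, 30]
s1 = "3+"
s2 = "len(s)"
x = evaluate(s1 + s2, {"s": s})      # evaluates 3+len(s)
print(x)

try:
    evaluate("3+len(t)", {"s": s})   # t was never declared
    undeclared_failed = False
except NameError:
    undeclared_failed = True
```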
The execute procedure takes a string as argument. The string must evaluate to valid OpenEuphoria code. This code is then executed as if it were hardcoded at the position of the execute procedure call.
This procedure may not be supported by all compiled or translated versions of OpenEuphoria, as providing support for this capability might prove particularly tricky or inefficient in these contexts.
Object oriented programming, or OOP, is not directly built in OpenEuphoria. External libraries will get notifications of OOP syntactic constructs and will have to implement these constructs.
The OOP library is to be included in the reserved OO namespace. Normally, functions of the library are not called directly; the interpreter plugs in the appropriate OO calls, like a preprocessor would.
Action | OE syntax | Translation |
---|---|---|
Starts a class definition | class Identifier | constant Identifier=OO:begin_class() |
Ends a class definition | end class | OO:end_class() |
Declares a private part of a class | private do | OO:begin_private() |
Ends a private part of a class | end private | OO:end_private() |
Declares a public part of a class | public do | OO:begin_public() |
Ends a public part of a class | end public | OO:end_public() |
Declares a protected part of a class | protected do | OO:begin_protected() |
Ends a protected part of a class | end protected | OO:end_protected() |
Applies a method to an object | Identifier1->Identifier2({expr}) | OO:call_method(Identifier1,Identifier2,{{expr}}) |
Gets a member from a class | identifier1->identifier2 | OO:get_member(identifier1,identifier2) |
Sets a member of a class | identifier1->identifier2=expr | OO:set_member(identifier1,identifier2,expr) |
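The preprocessor-like rewriting of the three arrow forms in the table can be sketched as follows. This is a Python illustration, not part of OpenEuphoria: the function name translate_arrow and the regular expressions are hypothetical, and the sketch only handles a single, simple statement per line.

```python
import re

# Hypothetical pattern for an identifier; the real lexical rules may differ.
IDENT = r"[A-Za-z_][A-Za-z_0-9]*"


def translate_arrow(line):
    """Rewrite one arrow construct into the OO: call listed in the table."""
    line = line.strip()

    # Method call: obj->method(args)
    m = re.fullmatch(rf"({IDENT})->({IDENT})\((.*)\)", line)
    if m:
        obj, method, args = m.groups()
        # The arguments are wrapped in braces, i.e. passed as one sequence.
        return f"OO:call_method({obj},{method},{{{args}}})"

    # Member assignment: obj->member=expr
    m = re.fullmatch(rf"({IDENT})->({IDENT})\s*=\s*(.+)", line)
    if m:
        obj, member, expr = m.groups()
        return f"OO:set_member({obj},{member},{expr})"

    # Member read: obj->member
    m = re.fullmatch(rf"({IDENT})->({IDENT})", line)
    if m:
        obj, member = m.groups()
        return f"OO:get_member({obj},{member})"

    return line  # nothing to translate
```

A real implementation would work on the token stream rather than on text lines, so that arrow constructs nested inside larger expressions are also handled.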
Even though OpenEuphoria has very specific features and has a more abstract definition of data types than most other usual languages, it is able to interface with RAM structures, external files or devices, and compiled libraries from other languages.
OpenEuphoria programs access files or devices using handles, or channel numbers. These integers are required by almost all communication functions.
The three lowest possible values for channels are reserved: 0 is the standard input (usually, the keyboard), 1 is the standard output (normally, the console) and 2 is the standard error (normally, the console also). Redirection is handled by the host OS and not by the language.
Associating a channel to an external file or device is done through the open(name,mode) function. name is a file or device name string passed to the OS and assumed to be recognized and duly processed by it. mode is a string taken from the following list:
In addition, you may add the "b" modifier to access files as binary rather than text ("rb" accesses a binary file for read only and so forth). Text files are organized in logical lines, separated by a format specific marker, and are supposed to hold values mapped to printable characters; binary files don't know about lines and may hold any kind of binary values.
A returned value of -1 means the association was not possible for a variety of reasons (file not found, access denied, device busy, unsupported mode, ...).
Once an I/O channel is defined, you can read from and write to it. If it supports random access, you can position a channel pointer to select a place to read from or write to.
The following functions read from a file or device, updating the channel pointer when applicable:
The following procedures write to a file or device, updating the channel pointer when applicable:
When a logical line is read from a channel, the OS-specific line terminator is removed and replaced by a \n (ASCII 10) logical terminator. The reverse operation is performed on output. If no line is available from the channel, -1 is returned.
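The terminator handling described above can be modeled in a few lines. This is a Python sketch under stated assumptions: the host terminator is taken to be CR LF, channels are modeled by io.BytesIO objects, and the helper names read_logical_line and write_logical_line are hypothetical.

```python
import io


def read_logical_line(channel):
    """Return the next line with a single b"\\n" terminator, or -1 at end of data."""
    raw = channel.readline()          # bytes up to and including b"\n"
    if raw == b"":
        return -1                     # no line available
    if raw.endswith(b"\r\n"):
        return raw[:-2] + b"\n"       # replace the host terminator by \n
    return raw


def write_logical_line(channel, line, os_terminator=b"\r\n"):
    """Reverse operation: replace the logical \\n by the host terminator."""
    if line.endswith(b"\n"):
        line = line[:-1] + os_terminator
    channel.write(line)
```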
?something is a shorthand for pretty_print(1,something,SystemPPOptions).
You may suspend, resume or stop access to a channel:
Some channels have a pointer, which you can get or set, that controls where the next read or write will occur:
OpenEuphoria has some types which have no equivalent in other languages, or may have to read data interpreted in different ways by other languages. Besides access to raw memory, OpenEuphoria recognizes main datatypes from other languages and can freely arrange them into raw structures, which are memory areas organized as standard structures.
OpenEuphoria provides the following functions and procedures:
Non-OpenEuphoria data is known by its physical properties only. The following describes predefined and general external types. Although they are named "types", you cannot use them outside the context of structures in RAM without causing an error.
(Open)Euphoria uses the following predefined types to qualify data sent to external routines. The table below also mentions the translations, when they exist, in terms of the general external types described in the next section.
Name | Meaning | Value |
---|---|---|
C_CHAR | signed byte | #0100_0001 |
C_UCHAR | byte | #0200_0001 |
C_SHORT | signed word = signed byte2 | #0100_0002 |
C_USHORT | unsigned word = byte2 | #0200_0002 |
C_INT, C_LONG | signed 4 byte integer = signed byte4 | #0100_0004 |
E_INTEGER | signed 31-bit integer | #0600_0004 |
C_UINT, C_ULONG, C_POINTER | unsigned dword = byte4 | #0200_0004 |
C_XLONG | signed qword | #0100_0008 |
C_UXLONG | unsigned qword = byte8 | #0200_0008 |
C_FLOAT | single precision FP number = float32 | #0300_0004 |
C_DOUBLE | double precision FP number = float64 | #0300_0008 |
E_ATOM | Euphoria atom | #0700_0004 |
C_STRING | ASCIZ string = bytexx, last byte is 0 = delimited(0) byte | #0800_0001 |
E_SEQUENCE | Euphoria sequence | #0800_0004 |
P_STRING | first byte is the number of remaining bytes = counted1 byte | #0801_0001 |
P_XSTRING | first 4 bytes are the number of remaining bytes = counted4 byte | #0801_0004 |
E_OBJECT | Euphoria object | #0900_0004 |
The value in the third column is actually aliased by the name in the first column. OpenEuphoria supports all external types Euphoria supports, plus 64-bit integers and Pascal/Ada counted strings.
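Reading the table, the values of the C types appear to follow a pattern: the upper 16 bits look like a category code and the lower 16 bits match the size in bytes. This is an observation from the table, not something the text states. The Python sketch below checks that reading against the struct module's standard sizes; the dictionary names and the choice of struct format characters are assumptions made for the illustration.

```python
import struct

# Values copied from the table above (C types only).
TYPE_CODES = {
    "C_CHAR": 0x01000001, "C_UCHAR": 0x02000001,
    "C_SHORT": 0x01000002, "C_USHORT": 0x02000002,
    "C_INT": 0x01000004, "C_UINT": 0x02000004,
    "C_XLONG": 0x01000008, "C_UXLONG": 0x02000008,
    "C_FLOAT": 0x03000004, "C_DOUBLE": 0x03000008,
}

# Assumed Python struct format characters for the same physical layouts.
STRUCT_FORMAT = {
    "C_CHAR": "b", "C_UCHAR": "B", "C_SHORT": "h", "C_USHORT": "H",
    "C_INT": "i", "C_UINT": "I", "C_XLONG": "q", "C_UXLONG": "Q",
    "C_FLOAT": "f", "C_DOUBLE": "d",
}


def split_code(code):
    """Split a type value into (upper 16 bits, lower 16 bits)."""
    return code >> 16, code & 0xFFFF


def sizes_agree():
    """True if the lower half of every code equals the packed size in bytes."""
    return all(
        struct.calcsize("<" + STRUCT_FORMAT[name]) == split_code(code)[1]
        for name, code in TYPE_CODES.items()
    )
```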
OpenEuphoria also allows you to describe a data type by its physical properties alone. The general form of a general type descriptor is as follows:
For instance, C_STRING is described by delimited(0) byte, while a P_XSTRING is a counted4 byte. An array of signed 64-bit integers of length 17 will be signed byte8 (17). If X_STRING is a C_STRING that may also be terminated by two -1 in a row, then it is described as delimited(0,{255,255}) byte.
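To make the delimited and counted layouts concrete, here is a Python sketch that reads each kind from a byte buffer. It is only a model of the physical layout: the function names are hypothetical, and the count of a counted string is assumed to be stored little-endian, which the text does not specify.

```python
def read_delimited(data, terminators, start=0):
    """Read bytes up to the first terminator; return (payload, next_offset).

    `terminators` is a list of byte strings, so delimited(0,{255,255})
    is modeled as [b"\\x00", b"\\xff\\xff"].
    """
    pos = start
    while pos < len(data):
        for term in terminators:
            if data.startswith(term, pos):
                return data[start:pos], pos + len(term)
        pos += 1
    raise ValueError("no terminator found")


def read_counted(data, count_size, start=0):
    """Read a string prefixed by its length; return (payload, next_offset).

    count_size is 1 for counted1 and 4 for counted4 (little-endian assumed).
    """
    count = int.from_bytes(data[start:start + count_size], "little")
    begin = start + count_size
    end = begin + count
    if end > len(data):
        raise ValueError("buffer shorter than the announced length")
    return data[begin:end], end
```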
You can use predefined or general external types interchangeably when they both exist.
OpenEuphoria allows you to treat blocks of memory like ordinary structures, except that they have special types and can't hold ordinary types.
Declaring an ordinary or external structure is done in a very similar way, except that structure is replaced by memory. There are other less visible differences too.
There are no context types as for structures, but lengths can be expressed using other fields of the structure; the dot syntax is recycled for this purpose.
Thus, the following declarations
memory color(byte R,byte G,byte B) end memory
memory colors(
C_LONG nbitems,
color col_array(.nbitems)
) end memory
define a color RAM structure as an RGB triple of bytes, and a colors RAM structure that starts with a C_LONG (a signed 4-byte integer according to the table above) named nbitems, followed by an array of that many color structures, called col_array.
Any expression may be used, not only other fields of the structure.
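The physical layout of the colors structure, a count followed by that many RGB triples, can be illustrated with Python's struct module. The sketch assumes little-endian byte order, no padding between fields, and a 4-byte signed count as the table gives for C_LONG; the function names are hypothetical.

```python
import struct


def pack_colors(colors):
    """Lay out a list of (R, G, B) triples as count + packed triples."""
    out = struct.pack("<i", len(colors))          # nbitems: 4-byte signed integer
    for r, g, b in colors:
        out += struct.pack("<BBB", r, g, b)       # one color: three bytes
    return out


def unpack_colors(data):
    """Read nbitems first, then use it as the length of col_array."""
    (nbitems,) = struct.unpack_from("<i", data, 0)
    return [struct.unpack_from("<BBB", data, 4 + 3 * i) for i in range(nbitems)]
```

The second function shows why the length has to be expressed through another field: the size of col_array is only known once nbitems has been read.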
Some structures come in various flavors, with the existence of some fields depending on the value of other fields, or on other factors, like an OS version. To address this situation, create a memory with the maximal number of fields, and declare some of them as optional, so that shorter variants of the structure still qualify as that memory type.
For instance, the following declaration:
memory city_info(
C_STRING name,
C_STRING zipcode,
optional C_LONG population,
C_FLOAT area_acres,
optional C_FLOAT average_growth
) end memory
actually defines three different structures that all qualify as a city_info. The shortest kind only has the name and zipcode fields; the intermediate kind has all fields but the last, and the longest kind has all fields in the description.
As the physical size of a memory field is unambiguously known, it makes sense to apply a bitwise rotation to it.
The call rotate(mem,field,n) will rotate the field field of the memory instance mem by n positions to the right. Use negative values of n to rotate to the left.
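A bitwise rotation of a fixed-width field can be sketched as follows. This Python function works on a plain integer rather than on a memory field; its name and the explicit width_bits parameter are assumptions for the illustration, since in OpenEuphoria the width is known from the field's type.

```python
def rotate_right(value, width_bits, n):
    """Rotate the low `width_bits` bits of `value` by n positions to the right.

    A negative n rotates to the left.
    """
    mask = (1 << width_bits) - 1
    value &= mask
    n %= width_bits   # Python's % is never negative here, so -1 becomes width_bits - 1
    return ((value >> n) | (value << (width_bits - n))) & mask
```

Reducing n modulo the width means a rotation by the full width, or by any multiple of it, leaves the field unchanged.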
The deftype metadata is meaningless for structures, and is set to 0. For memory types, it is set to 0 by default, but can be changed. If a memory type has optional fields, deftype specifies the number of optional fields you don't want to be taken into account.
For instance, in the city_info example, city_info@deftype may be set to 0, 1 or 2. The shortest form corresponds to the value 2, the intermediate form to 1 and the longest default form to 0.
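One reading that reproduces the three forms listed for city_info is that the structure is cut off at the deftype-th optional field, counted from the end. The Python sketch below models that reading only; it is an interpretation, not a stated rule, and the names CITY_INFO and fields_for are hypothetical (the last field is written average_growth here).

```python
# (field name, is_optional) pairs in declaration order.
CITY_INFO = [
    ("name", False),
    ("zipcode", False),
    ("population", True),
    ("area_acres", False),
    ("average_growth", True),
]


def fields_for(deftype, fields=CITY_INFO):
    """Return the field names present for a given deftype value.

    Assumed rule: deftype 0 keeps everything; otherwise the structure is
    truncated just before the deftype-th optional field from the end.
    """
    optional_positions = [i for i, (_, opt) in enumerate(fields) if opt]
    if not 0 <= deftype <= len(optional_positions):
        raise ValueError("deftype out of range")
    cut = len(fields) if deftype == 0 else optional_positions[-deftype]
    return [name for name, _ in fields[:cut]]
```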
The id metadata doesn't make sense for memory instances, as they are not variables. For this reason, the id metadata of a memory instance holds its physical address instead. This allows you to read from or write to a memory using peek and poke, as well as the other routines described below.
As OpenEuphoria types don't have a fixed size, writing to and reading from memory must be done using specific routines.
Additionally, assigning to and from memory fields is done using the = sign. When some OpenEuphoria expression has a result that doesn't fit the field it is written to, the ExternalOverflow exception is raised. Its handler may take corrective action, such as emulating infinite values.
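The overflow check on assignment to a fixed-size integer field can be modeled like this. It is a Python sketch: the ExternalOverflow class below is a stand-in that merely borrows the name of the exception, store_integer and store_clamped are hypothetical helpers, little-endian storage is assumed, and clamping is just one example of a corrective action.

```python
class ExternalOverflow(Exception):
    """Stand-in for the exception raised when a value does not fit a field."""


def field_range(size_bytes, signed):
    """Smallest and largest integers that fit the field."""
    bits = 8 * size_bytes
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def store_integer(value, size_bytes, signed):
    """Return the field's bytes, or raise ExternalOverflow if value is too big."""
    low, high = field_range(size_bytes, signed)
    if not low <= value <= high:
        raise ExternalOverflow(f"{value} does not fit in {size_bytes} byte(s)")
    return value.to_bytes(size_bytes, "little", signed=signed)


def store_clamped(value, size_bytes, signed):
    """Example corrective action: saturate at the nearest representable value."""
    try:
        return store_integer(value, size_bytes, signed)
    except ExternalOverflow:
        low, high = field_range(size_bytes, signed)
        return store_integer(max(low, min(high, value)), size_bytes, signed)
```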
OpenEuphoria programs may access routines and data from external files or processes, and share their own routines and data with other processes.
The following allows you to directly inspect or change the file system tree:
OpenEuphoria slightly breaks compatibility with Euphoria here, as DOS-specific calls are not supported.
The following functions are also provided:
You have some control over, and information about, the current process:
OpenEuphoria also provides a variety of system calls, including generic execution of shell commands:
Even though OpenEuphoria almost drops support for DOS programs, it cannot do so completely and provides a limited set of commands to address specifics of this OS:
Mathematical functions are described in more detail in part C. Here is a list of what OpenEuphoria provides:
floor(x) | the greatest integer not greater than x. |
ceiling(x) | the smallest integer not less than x. |
remainder(x,y) | x-y*floor(x/y) |
abs(x) | computes the absolute value of an integer or floating point number. |
sqrt(x) | calculates the square root of an object |
rand(x) | generates random numbers |
set_rand(x) | initializes the random number generator rand uses. |
sin(x) | calculates the sine of an angle |
arcsin(x) | calculates the angle with a given sine |
cos(x) | calculates the cosine of an angle |
arccos(x) | calculates the angle with a given cosine |
tan(x) | calculates the tangent of an angle |
arctan(x) | calculates the arc tangent of a number |
log(x) | calculates the natural logarithm |
exp(x) | calculates the exponential of a number |
power(base,exponent) | calculates base raised to the power exponent. |
E | the base of the natural logarithm (2.7182818...), or exp(1). |
PI | the circle perimeter/diameter ratio (3.14159...) |
scale2(x) | returns the exponent of the highest power of 2 not greater than the absolute value of the argument. |
scale10(x) | returns the exponent of the highest power of 10 not greater than the absolute value of the argument. |
int_to_bits(some_int,size) | Returns a sequence of 0's and 1's representing the size least significant bits of some_int. The first element of this sequence is remainder(some_int,2) when some_int is nonnegative. |
bits_to_int(some_seq) | Returns an integer made of the 0's and 1's of some_seq. some_seq[1] is the least significant bit of the returned integer, and so forth. |
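The definitions given in the table for remainder, scale2, scale10, int_to_bits and bits_to_int can be checked with a small Python model. These functions follow the table's wording literally; they are not OpenEuphoria's implementation, and the handling of zero and of negative integers in the bit routines is an assumption.

```python
import math


def remainder(x, y):
    """x - y*floor(x/y), exactly as written in the table."""
    return x - y * math.floor(x / y)


def scale2(x):
    """Exponent of the highest power of 2 not greater than abs(x)."""
    if x == 0:
        raise ValueError("undefined for 0")
    _, exponent = math.frexp(abs(x))    # abs(x) = m * 2**exponent, 0.5 <= m < 1
    return exponent - 1


def scale10(x):
    """Exponent of the highest power of 10 not greater than abs(x)."""
    x = abs(x)
    if x == 0:
        raise ValueError("undefined for 0")
    e = 0
    while 10 ** (e + 1) <= x:           # too small: move up
        e += 1
    while 10 ** e > x:                  # too large: move down
        e -= 1
    return e


def int_to_bits(some_int, size):
    """The `size` least significant bits, least significant bit first."""
    return [(some_int >> i) & 1 for i in range(size)]


def bits_to_int(some_seq):
    """Inverse of int_to_bits: the first element is the least significant bit."""
    return sum(bit << i for i, bit in enumerate(some_seq))
```

For a nonnegative argument, the first element returned by int_to_bits equals remainder(some_int,2), which matches the remark in the table.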