Difference between revisions of "CoCoA:CoCoA5Client Overview"

From ApCoCoAWiki

Latest revision as of 17:45, 20 October 2007

Design Overview

Caution: Ongoing Work

This document is the result of ongoing discussion between John, Franceso and Michael. It is a collection of thoughts about where we want to go with the new CoCoA 5 client.

Executive Summary

  • We have 2 modes: CoCoA 4 and CoCoA 5
  • We support the CoCoA 4 language in CoCoA 4 mode as close to 100% as is humanly possible. Certain features are marked as deprecated (see below for a more detailed discussion).

Modular design

The CoCoA 4 client is not properly documented, and it is not obvious where all the parts of the interpreter are implemented. In addition, it is possible to add built-in commands on different levels. For example, the Print function is defined in the grammar and then reintroduced in the parser.

The client consists of several pieces:

  • lexer (prototype done)
  • parser (some code exists, needs work to set up a node tree that can be fed to the interpreter)
  • interpreter (prototype done - works by employing RTTI)

The current goal is to get the parser into a shape where we can hook all three pieces together. For a more detailed description, see the following topics.

Parser: CoCoA 4 compatibility mode vs. CoCoA 5 mode

The interpreter won't need to differentiate between those two modes since it is the parser's job to create a node tree. Once a mode has been chosen, the CoCoA 5 client stays in that mode unless it is reset. We will not provide the possibility of mixing CoCoA 4 and CoCoA 5 style code.

One feature that we strongly desire is the possibility to type variables when we instantiate them. This cannot be done in CoCoA 4 compatibility mode. Hence the parser would turn a definition of a function like

Define F(A,B)

into

Any Define F(Any A, Any B)

internally. The Any-type is the old CoCoA 4 type. Using the Any-type has certain drawbacks, for example the need to check type compatibility at runtime.
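The rewriting step described above can be sketched as follows. This is a minimal illustration in Python, not the parser itself: the helper name annotate_define and the regular expression are assumptions, and a real parser would work on tokens rather than on raw text.

```python
import re

# Hypothetical pattern for an untyped CoCoA 4 style function header.
_DEFINE = re.compile(r"^\s*Define\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(([^)]*)\)\s*$")


def annotate_define(header: str) -> str:
    """Rewrite an untyped header into the fully typed form using Any."""
    match = _DEFINE.match(header)
    if match is None:
        raise ValueError(f"not an untyped Define header: {header!r}")
    name, raw_params = match.groups()
    # Split the parameter list and drop empty entries (e.g. for "F()").
    params = [p.strip() for p in raw_params.split(",") if p.strip()]
    typed = ", ".join(f"Any {p}" for p in params)
    return f"Any Define {name}({typed})"


print(annotate_define("Define F(A,B)"))
```

Because every untyped parameter becomes Any, the interpreter only ever sees typed definitions and does not need to know which mode produced them.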

To reference count or not to reference count?

There is still an ongoing discussion whether we should reference count objects or not. While one can easily construct a case where reference counting would be of tremendous advantage the question remains whether this is worth the hassle since it complicates code. One possible way out of the mess is the ability to pass variables/objects to subroutines by reference. Being able to make an object/variable const may be an added bonus.

CoCoA4

Parser: How many passes?

We are certain that a one pass parser is not sufficient. Hence the question remains: how many passes should we do? One suggestion:

  • first pass: determine whether we run in CoCoA 4 compatibility mode or CoCoA 5 mode.
  • second pass: determine the names and numbers of indeterminates, build listing of all functions and variables
  • third pass: build node tree and call the interpreter

Namespace

We have two separate namespaces:

  1. packages
  2. indeterminates.

In addition we have four different overlapping namespaces:

  1. functions
  2. identifiers
  3. rings
  4. aliases

Variables

The name of a variable consists of alphanumeric characters and underscores. It has to start with either an underscore or a capital letter.
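The naming rule can be written down as a single pattern. The following Python sketch is only an illustration; the function name is hypothetical, and we assume that "alphanumeric" means the ASCII letters and digits.

```python
import re

# First character: underscore or capital letter; rest: letters, digits, underscore.
_COCOA4_VARIABLE = re.compile(r"[A-Z_][A-Za-z0-9_]*")


def is_valid_cocoa4_variable(name: str) -> bool:
    """Return True if name follows the CoCoA 4 variable naming rule."""
    return _COCOA4_VARIABLE.fullmatch(name) is not None


for candidate in ["Foo_1", "_tmp", "foo", "1A"]:
    print(candidate, is_valid_cocoa4_variable(candidate))
```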

Panel

Keep supporting it.

Catch

Keep supporting it.

Features that might be deprecated

  • Block ... EndBlock If Condition

CoCoA5

Parser: How many passes?

We are confident that we only need to do one pass in CoCoA 5 mode.

Namespace

One flat namespace, i.e. x cannot be a variable and an indeterminate at the same time.

A member of the namespace consists of alphanumeric characters plus underscores. The members are case sensitive.

All members of the namespace are 7-bit ASCII clean, i.e. no accented characters. Depending on further experiments, the lexer might throw out all non-7-bit-ASCII characters before handing over strings to the parser. A Unicode version of the CoCoA 5 client seems rather unlikely at the moment.
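A flat namespace with these rules could look roughly like the sketch below. It is written in Python for illustration only; the class name, the method names and the kind labels ("variable", "indeterminate") are assumptions.

```python
import re

# Members are 7-bit ASCII: letters, digits and underscores only.
_MEMBER = re.compile(r"[A-Za-z0-9_]+")


class FlatNamespace:
    """One namespace shared by variables, indeterminates, functions, etc."""

    def __init__(self):
        self._members = {}

    def declare(self, name: str, kind: str) -> None:
        if _MEMBER.fullmatch(name) is None:
            raise ValueError(f"illegal member name: {name!r}")
        if name in self._members:
            # A name can only be one thing at a time.
            raise KeyError(f"{name} is already a {self._members[name]}")
        self._members[name] = kind

    def kind_of(self, name: str) -> str:
        return self._members[name]


ns = FlatNamespace()
ns.declare("x", "indeterminate")
ns.declare("X", "variable")  # allowed: members are case sensitive
```

Declaring x a second time, for example as a variable, raises an error in this sketch because the name is already taken by the indeterminate.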

Variables

  • The maximum length of variable names is defined in a header file at compilation; the suggested length is 250 characters.
  • Two options for overlength variable names: truncate and warn, or throw an error.

Needs more discussion
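The two options for overlength names can be compared with a small sketch. It is in Python for illustration; the constant, the function name and the policy strings are assumptions, and the limit of 250 is the suggested value from above.

```python
import warnings

MAX_NAME_LENGTH = 250  # suggested limit; would be a compile-time constant


def check_name_length(name: str, on_overlength: str = "truncate") -> str:
    """Apply one of the two proposed policies to an overlength name."""
    if len(name) <= MAX_NAME_LENGTH:
        return name
    if on_overlength == "error":
        raise ValueError(f"name longer than {MAX_NAME_LENGTH} characters")
    if on_overlength == "truncate":
        warnings.warn(f"name truncated to {MAX_NAME_LENGTH} characters")
        return name[:MAX_NAME_LENGTH]
    raise ValueError(f"unknown policy: {on_overlength!r}")
```

One point for the discussion: with truncation, two distinct long names can silently collapse into the same name, which the error policy avoids.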

Datatypes

One might miss certain datastructures in the current CoCoA 4 when implementing algorithms. Here are a couple of suggestions from us:

  • Mapping: should function like a hash with key-value pairs, implemented using c++ mappings
  • homogeneous list: since lists in CoCoA 4 are currently composed of Any-types, sorting lists is very expensive (roughly cubic or worse). Having homogeneous lists gives us the opportunity to implement very efficient sorting algorithms with n*log(n) average case cost.
  • double linked lists vs. arrays: lists are currently implemented as vectors of pointers. As a result, inserting or deleting elements from a list is rather expensive. So we would like to offer two different kinds of lists: one uses a vector approach so that elements can be accessed in constant time, with the downside that inserting or deleting elements is expensive. The other one would be a double linked list, making inserting and deleting elements very cheap, with the downside that access to random elements would be expensive. Depending on the needs of the algorithm, e.g. if we need to partition a list depending on a condition, one would choose one approach over the other.
  • Since we will interface with many different external numerical libraries (think BLAS, LAPACK), it could be beneficial to add a datatype double (with all its known limitations) to CoCoA 5 in order to make the life of plugin writers easier. We would obviously also need functions to convert between double and arbitrary precision types.
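The idea behind the homogeneous list can be sketched as follows: the element type is checked once on insertion, so a sort never has to check types while comparing. The sketch is in Python for illustration only, and the class and method names are assumptions.

```python
class HomogeneousList:
    """A list whose elements all have exactly the same type."""

    def __init__(self, element_type, items=()):
        self._type = element_type
        self._items = []
        for item in items:
            self.append(item)

    def append(self, item) -> None:
        # The type check happens once here, not during every comparison.
        if type(item) is not self._type:
            raise TypeError(f"expected {self._type.__name__}")
        self._items.append(item)

    def sorted(self) -> list:
        # Python's built-in sort runs in O(n log n) comparisons.
        return sorted(self._items)


degrees = HomogeneousList(int, [3, 1, 2])
print(degrees.sorted())
```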

Panels

We have four keywords manipulating Panels in CoCoA 4: Set, Unset, Option & Panel. We intend to break the way panels work in CoCoA 5.

Each package can and should have a panel. Values can be of any available type. For example:

Package Foo;
Panel
  SUGAR:=False;
  Name:="value";
EndPanel

To access a panel value, use $Foo.SUGAR; to assign a value, use $Foo.SUGAR:=False; to reset a single value, use reset $Foo.SUGAR; and to reset the whole panel, use reset $Foo.

We store settings for the interpreter and internal functions (like the timestamp format) in the CoCoA5 panel.
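The access, assign and reset semantics of a panel can be modelled in a few lines. This Python sketch is an illustration only; the class and its method names are assumptions, and we assume that "reset" means "restore the value given in the Panel ... EndPanel block".

```python
class Panel:
    """Per-package settings that remember their declared defaults."""

    def __init__(self, **defaults):
        self._defaults = dict(defaults)
        self._values = dict(defaults)

    def get(self, key):
        return self._values[key]

    def set(self, key, value) -> None:
        if key not in self._defaults:
            raise KeyError(key)  # only declared entries can be assigned
        self._values[key] = value

    def reset(self, key=None) -> None:
        if key is None:
            self._values = dict(self._defaults)  # like: reset $Foo
        else:
            self._values[key] = self._defaults[key]  # like: reset $Foo.SUGAR


foo = Panel(SUGAR=False, Name="value")
foo.set("SUGAR", True)
foo.reset("SUGAR")
print(foo.get("SUGAR"))
```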

Errors and warnings

The parser should emit warnings and errors; warnings could be suppressed via an option. All errors and warnings should identify themselves by a unique number, enabling users to look up further details in a manual. Those details should demonstrate (on a simple piece of code) what is wrong and how to fix it.
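Numbered diagnostics with an option to suppress warnings might be structured like this. The sketch is in Python for illustration; the numbers, messages and the W/E prefix format are invented examples, not an agreed numbering scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    number: int    # unique number that users can look up in the manual
    severity: str  # "warning" or "error"
    message: str

    def render(self) -> str:
        prefix = "W" if self.severity == "warning" else "E"
        return f"{prefix}{self.number:04d}: {self.message}"


def report(diagnostics, suppress_warnings=False):
    """Render diagnostics, optionally leaving out all warnings."""
    return [
        d.render()
        for d in diagnostics
        if not (suppress_warnings and d.severity == "warning")
    ]


found = [
    Diagnostic(12, "warning", "variable name truncated"),
    Diagnostic(301, "error", "unterminated block comment"),
]
print(report(found, suppress_warnings=True))
```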

Catch and Uncatch

A good idea, similar to the concept of exceptions.

TIME

Keep TIME for compatibility, but introduce a native type TimeStamp, e.g.

TimeStamp T1,T2;
T1:=Now();
DoSomething();
T2:=Now();
Print T2-T1;

Precision should be 1/100 of a second, just like in CoCoA 4, or better, depending on the platform. Linux on x86-64, for example, provides higher resolution.

The format of the time is set via the CoCoA5 panel. Write a timer package that would handle starting, stopping and pausing timers.
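The TimeStamp idea maps directly onto a monotonic clock. The following Python sketch is an illustration only: the names TimeStamp, now and format_elapsed are assumptions, and the two-decimal format stands in for whatever format the CoCoA5 panel would select.

```python
import time


class TimeStamp:
    """A point in time; subtracting two gives elapsed seconds."""

    def __init__(self, seconds: float):
        self.seconds = seconds

    def __sub__(self, other: "TimeStamp") -> float:
        return self.seconds - other.seconds


def now() -> TimeStamp:
    # perf_counter is monotonic and uses the best resolution available.
    return TimeStamp(time.perf_counter())


def format_elapsed(seconds: float) -> str:
    """Format with 1/100 second precision, the minimum asked for above."""
    return f"{seconds:.2f}"


t1 = now()
sum(range(100000))  # stands in for DoSomething()
t2 = now()
print(format_elapsed(t2 - t1))
```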

Packages

In order to keep the parser slim and the number of keywords down, we offer the ability to define functions as "builtin" and provide some way to map those to the Nodes of the interpreter. We also offer a similar way to register functions from plugins. We need to consider the way different platforms name their libraries, e.g. the library test would map

  • to libtest.so on MacOSX/Linux/Unix
  • to test.dll on Windows
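The mapping from a library name to a file name can be kept in one small function. This Python sketch is an illustration; the function name and the platform strings are assumptions, and the mapping itself simply follows the two cases listed above.

```python
def library_filename(name: str, platform: str) -> str:
    """Map a logical library name to the platform's file name."""
    platform = platform.lower()
    if platform == "windows":
        return f"{name}.dll"
    if platform in ("macosx", "linux", "unix"):
        return f"lib{name}.so"
    raise ValueError(f"unknown platform: {platform!r}")


print(library_filename("test", "Linux"))
print(library_filename("test", "Windows"))
```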

Names of packages are case insensitive to account for case insensitive file systems like FAT32, NTFS and HFS. Besides alphanumeric characters and underscores we need the standard slash as a directory separator.

To be discussed later:

  • how to export namespaces and what is visible from outside, e.g. functions starting with an underscore are not publicly available

Input and Output

Kill the internal device interfaces and properly use filehandles. Also make stdin & stdout easily available to CoCoA5.

PlugIns

Each plugin requires a package which defines the functions it provides by declaring them External. The CoCoA 5 client provides a well-defined API to create, manipulate and delete data structures from within CoCoA 5.

Anonymous functions

Useful in certain circumstances, like a user-defined sort operator:

SortBy(L,(x,y)->(Deg(x),Deg(y)));
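The exact meaning of the anonymous function in the SortBy call is still open. One possible reading is "compare x and y by their degree", which the following Python sketch illustrates. The polynomial representation (a dictionary from exponent tuples to coefficients) and the helper names deg and sort_by are assumptions made for this example.

```python
from functools import cmp_to_key


def deg(poly: dict) -> int:
    """Total degree of a polynomial stored as {exponent tuple: coefficient}."""
    return max(sum(exponents) for exponents in poly)


def sort_by(items, compare):
    """Sort with a user-supplied comparison, here an anonymous function."""
    return sorted(items, key=cmp_to_key(compare))


p1 = {(1, 0, 0): 1}                   # x
p3 = {(1, 1, 1): 2}                   # 2xyz
p6 = {(2, 3, 1): 1, (0, 3, 1): 23}    # x^2y^3z + 23zy^3

ordered = sort_by([p6, p1, p3], lambda x, y: deg(x) - deg(y))
print([deg(p) for p in ordered])
```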

Comments

We allow three types of comments:

  • --
  • //
  • /* .. */

The first two result in the rest of the line being ignored while the third kind can comment out whole sections of code. While the third kind cannot be nested in itself, it can comment out sections of code that contain the first two kinds of comment.

Comments inside strings are considered strings, i.e.

String example="1 2 3 /* 4 */ 5 6"

is a valid string containing

1 2 3 /* 4 */ 5 6
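The comment rules above can be checked against a small state machine. This is a sketch in Python for illustration, not the lexer prototype: the function name is hypothetical, strings are assumed to be delimited by double quotes without escape sequences, and no attempt is made to tell the comment marker -- apart from two minus signs in an expression.

```python
def strip_comments(source: str) -> str:
    """Remove --, // and /* .. */ comments, leaving string contents alone."""
    out = []
    i, n = 0, len(source)
    in_string = False
    while i < n:
        ch = source[i]
        two = source[i:i + 2]
        if in_string:
            out.append(ch)
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif two in ("--", "//"):
            # Line comment: skip to the end of the line, keep the newline.
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif two == "/*":
            # Block comment: the first */ closes it, so no nesting.
            end = source.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(strip_comments('S:="1 2 3 /* 4 */ 5 6"; // a comment'))
```

Because a block comment ends at the first */, it can enclose line comments but cannot be nested, which matches the rule stated above.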

Multiplication by juxtaposition

Assuming that all indeterminates in a given ring are single letters we will allow multiplication by juxtaposition in certain circumstances. A code snippet to demonstrate:

Use R:=Q[x,y,z];
Poly f1;
f1:={x^2y^3z+23zy^3}
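To see what juxtaposition means for the parser, the following Python sketch makes the implied multiplications explicit. It is an illustration only: the function name is hypothetical, only the operators +, - and * are handled, and an exponent is read greedily, so x^23 is taken as x to the power 23 and not as x^2 times 3.

```python
import re

# A factor is a number or a single-letter indeterminate with optional power.
_TOKEN = re.compile(r"[0-9]+|[a-z](?:\^[0-9]+)?|[+\-*]")


def expand_juxtaposition(expr: str, indeterminates: str) -> str:
    """Insert an explicit * between adjacent factors."""
    compact = expr.replace(" ", "")
    tokens = _TOKEN.findall(compact)
    if "".join(tokens) != compact:
        raise ValueError(f"cannot tokenize: {expr!r}")
    out = []
    previous_was_factor = False
    for token in tokens:
        is_factor = token[0] not in "+-*"
        if token[0].isalpha() and token[0] not in indeterminates:
            raise ValueError(f"unknown indeterminate: {token[0]}")
        if is_factor and previous_was_factor:
            out.append("*")
        out.append(token)
        previous_was_factor = is_factor
    return "".join(out)


print(expand_juxtaposition("x^2y^3z+23zy^3", "xyz"))
```

The single-letter assumption is what makes this unambiguous: with multi-letter indeterminates, xy could be either one name or a product.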

Input from interactive mode

After CoCoA has received and processed input, it should be obvious that the client can accept new input. This could be accomplished in the following way:

CoCoA 5 waiting for input:

1: > 

The user inputs a command and presses return:

1: > A:=1;

Upon completion the client prints out a new line number and prompt:

1: > A:=1;
2: >

Scope of variables

To be discussed.

History function

Since we print out line numbers in interactive mode it is now possible for the user to recall the content of a given line quite easily. We imagine initially that a construct like

!7

would recall the command from line 7 into the current input field.
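The numbered prompt and the recall construct fit together naturally, as this small sketch shows. It is written in Python for illustration; the class and method names are assumptions, and recall only returns the text for the input field, it does not execute it.

```python
class InteractiveSession:
    """Keeps the numbered input lines of an interactive session."""

    def __init__(self):
        self._history = []

    def prompt(self) -> str:
        # Line numbers start at 1, as in the example prompts.
        return f"{len(self._history) + 1}: > "

    def record(self, line: str) -> None:
        self._history.append(line)

    def recall(self, request: str) -> str:
        """Return the text of an earlier line for a request like '!7'."""
        if not request.startswith("!") or not request[1:].isdigit():
            raise ValueError(f"not a history request: {request!r}")
        number = int(request[1:])
        if not 1 <= number <= len(self._history):
            raise IndexError(f"no line {number} in history")
        return self._history[number - 1]


session = InteractiveSession()
print(session.prompt())
session.record("A:=1;")
print(session.prompt())
print(session.recall("!1"))
```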

Open questions/ToDo

  • how should we switch rings
  • do variables/structures retain "their" ring after switching rings