Transform description language

Each transform consists of two parts: a header and a body. The language is case-insensitive. Optional fields are shown in square brackets [].

 

Header

The header has the following items:

.rxn

Marks the beginning of a new transform. Notice the dot before rxn.

name "transform name"

The transform name must be unique and begin with a letter. It must be enclosed in double quotes.

type transform type

At the present time, type may be GP1 or GP2. To add new types it is necessary to modify the source code.

G1 functional group list

Specifies the functional group that keys a GP1 transform or one of the two groups for a GP2 transform. It is a comma-separated list of functional group names. The names must match those in the file funcgrp.chm.

[G2 functional group list]

Used only in GP2-type transforms; specifies the second group that keys a GP2 transform. Note that if G1 and G2 are the same group, the transform is called twice (usually not a bad idea).

[path pathlength]

Used only in GP2 transforms, where it is mandatory. pathlength is an integer giving the number of atoms between the root atoms of the two groups, including both root atoms. Thus, the minimum length is usually two.

[rating=rating]

A priori rating. Default=50. The rating is a predefined numerical variable and may be modified anywhere in the transform body.

[.comments]

Comments block. The comments are a list of double quote-enclosed strings. The strings are automatically concatenated. Newline characters may be specified with the \n (backslash-en) sequence within the string. The comments block ends at the next header field.

.start

Marks the beginning of the body.
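
As an illustration, a complete header for a two-group transform might look like the following sketch. The group names ketone and alcohol are hypothetical placeholders (real names must match entries in funcgrp.chm), and the one-item-per-line layout is an assumption rather than something this guide prescribes.

```
.rxn
name "Example two-group disconnection"
type GP2
G1 ketone
G2 alcohol
path 3
rating=40
.comments
"Hypothetical transform, shown only to illustrate the header layout.\n"
"The two strings in this block are concatenated."
.start
```

Here path 3 means that the two root atoms are separated by one intervening atom, since both roots are counted.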

 

Body

The body of the transform is a statement list and goes after .start. The following statements are available:

 

Assignment

variable = expression

For example, rating = rating + 10 is a valid assignment.

 

If ... then ... else

if (expression) then

statement list

[else

statement list]

endif

The classical conditional execution statement. The endif is mandatory, even if the statement list is empty or just one line.
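
A minimal sketch of a conditional that adjusts the rating, using only the predefined variables A1 and rating and an atom type from the list under Is / isnot. The size of each adjustment is an arbitrary illustrative value, and whether indentation is permitted is not stated in this guide, so the lines are left unindented.

```
if (A1 is tertiary) then
rating = rating - 20
else
rating = rating + 10
endif
```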

 

Foratom

foratom (atom1 from atom0)

statement list

next

This is used to visit all of the neighbors of the atom specified by atom0. On each pass, the current neighbor is stored in the variable atom1.
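
A short sketch of a neighbor loop, assuming a user-chosen variable name nbr (hypothetical; the rules for naming variables are not spelled out in this guide). It lowers the rating once for every hydroxyl-type neighbor of A1.

```
foratom (nbr from A1)
if (nbr is hydroxyl) then
rating = rating - 10
endif
next
```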

 

Breakbond

breakbond(atom1, atom2)

Breaks the bond between the specified atoms. A double or triple bond is broken completely, not reduced in order.

 

Makebond

makebond(atom1, atom2)

Creates a single bond between atom1 and atom2.

 

Add

add(atom1, element)

Creates a new atom of the specified element and joins it to atom1 with a single bond.
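
The three structural statements can be combined as in the sketch below. Which atoms A1, A2 and A3 refer to depends on the functional group definition, so the atoms chosen here are purely illustrative, and the spelling of the element argument (Br) is an assumption, since this guide does not say how elements are written.

```
breakbond(A1, A2)
add(A2, Br)
makebond(A1, A3)
done
```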

 

Done

done

This statement is extremely important and easily overlooked: every transform needs at least one done statement. Once all the operations required by the transform have been applied to the molecule, done signals that the result is complete. A copy of the working molecule is saved, and a new working molecule is created that is exactly as the molecule was at the beginning of the transform's execution.

Execution continues at the statement after done, so done behaves less like an "end" statement and more like a "save". A transform may contain more than one done statement; for example, if done is inside a loop, a single execution of the transform may generate several different molecules. If there is no done statement, the transform is executed but returns no results!
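
To make the "save" behavior concrete, here is a sketch with done inside a loop. The variable name nbr is hypothetical. Each time the condition holds, one bond is broken and done saves that result; the working molecule is then restored to its starting state before the loop moves on, so every methyl neighbor of A1 can yield its own separate result.

```
foratom (nbr from A1)
if (nbr is methyl) then
breakbond(A1, nbr)
done
endif
next
```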

 

Operators

Expressions are very similar to those in other languages such as Pascal or BASIC, so they may contain nested parentheses. The available operators are the following.

 

Is / isnot

atom is atom_type

atom isnot atom_type

These are used in expressions to check whether a given atom is of a given type. The types are reserved words, and may be any of the following:

methyl

primary

secondary

tertiary

quaternary

vinyl

carbonyl

alkynyl

nitrile

allene

alkyl

sp3

sp2

sp

noncarbon

nitrile

hydroxyl

ether

peroxide

carbonyl

nitro

unidentified

allyl

benzyl

alpha_carbonyl

alpha_alkynyl

alpha_nitrile

The is / isnot operator has the highest priority. Its result is of boolean type.

 

Logical operators

and, or, not

The name says it all. The priority follows the typical order: not > and > or.

 

Comparison operators

>, >=, <, <=, <>, =

Notice that the equality operator is the same as the assignment operator. Surely this will be disliked by mathematicians and philosophers. The equality and not-equals operators may be used with variables of any type; the other comparison operators are only valid for numeric types.

 

Math operators

+, -, *, /

As usual, multiplication and division have a higher priority than addition and subtraction. Associativity is from left to right.
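
The priority rules stated above can be seen in the following sketch, which uses only predefined variables and listed atom types; the numbers are arbitrary.

```
rating = rating + 2 * 5
rating = 40 - 10 - 5
if (A1 is vinyl or A1 is carbonyl and A2 isnot methyl) then
rating = rating + 10
endif
```

The first line adds 10 to the rating, because the multiplication is done before the addition. The second assigns 25, because subtraction associates from left to right: (40 - 10) - 5. The condition groups as (A1 is vinyl) or ((A1 is carbonyl) and (A2 isnot methyl)), since is / isnot binds most tightly and and has priority over or. This guide does not state how comparison operators rank against the math and logical operators, so explicit parentheses are the safer choice when mixing them.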

Types and variables

The commonly used variable types are numeric (integer) and atom. Variables do not need to be declared; the program assigns their type automatically. Typing is global, however: once a variable has been assigned a type, it cannot be used with another type, even in a different transform. Other variable types include string and functional group, but these cannot currently be manipulated, so they behave more like constants.

There are several pre-defined variables.

 

A1 ... A9, B1 ... B9, P1 ... P9

These are all atom-type variables. They are assigned automatically at the beginning of each transform and are used as the basis for all atom-based operations. AN are the atoms of the first functional group, and are assigned according to the definition of the functional group (see the functional group description language guide). BN are likewise defined for the second functional group; obviously they are only useful in GP2-type transforms. PN are defined for the path between the two groups: P1 is the root of the first group, and the numbering goes on until the root of the second group. Notice that, for a 2-group transform, several atoms are assigned more than one variable (for example, A1 might be the same as P1).
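
For instance, in a GP2 transform with path 3, the path variables P1, P2 and P3 are assigned: P1 is the root of the first group, P3 is the root of the second, and P2 is the atom between them. A body that disconnects the bond between the first root and the middle path atom could then be sketched as follows, assuming that is the disconnection the transform intends.

```
breakbond(P1, P2)
done
```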

 

Rating

rating is a numerical variable used to evaluate how chemically useful a transform is. It is a semi-quantitative measure of the yield, generality, cost, etc. of the reaction. It should be modified as necessary by the transform body if structural conditions warrant it.