Elem - a program for the interpretation of elemental analysis results

What is elem?

Elem is a very simple program for the interpretation of elemental analysis results. It is text based, and could run on anything that has a command line and a C compiler (it has been tested on Unix and Windows). Elem comes with source code and it is completely free; you can do with it whatever you want except claiming you wrote it. However, it comes with absolutely no warranty.

Elem was developed in a hurry to speed up the interpretation of the elemental analysis results for a Chemistry thesis work. It served its purpose very well, so we decided to make it available to the public. However, it lacks a friendly interface and good error checking. You are free to improve it or modify it in any way. I would be grateful if you sent me the improved version.

How does it work?

Elem uses a brute-force approach. In the input file the user specifies, in addition to the experimental analysis results, the atomic or molecular blocks that should be varied by the program, and the ranges and steps of variation. The program then iterates to generate all the possible formulas that comply with the specification (optionally including the added restriction of electroneutrality). For each formula, elem calculates the error with respect to the experimental composition (the total squared absolute error and the relative error by element) and sorts the results by the total squared absolute error.

The brute-force approach worked perfectly for the cases that we were studying, which involved few variable blocks with small ranges of variation. It might be grossly inefficient for more complex problems. Ideally, we could use an optimization algorithm, but why bother?


To compile elem with a typical Unix C compiler, type:

cc -o elem elem.c eprintf.c


The easiest way to run elem is to type the following command line:

elem filename

where file filename contains the input information. See the example for the format of the input file.

To see a list of the command line options, type elem -h.

The program writes its results to standard output (usually the screen). Most of the time you will want to save the results to a file for later inspection. To do that, use your operating system's output redirection. Type

elem inputfilename > outputfilename

and the results will be saved in file outputfilename.

Real-life example:

A lead(II)-cysteine complex was synthesized from an aqueous lead nitrate solution and cysteine. To propose an empirical formula from the experimental analysis results, elem was used. It was possible to have nitrate as a ligand in addition to cysteine, and there was also the possibility of having water of hydration. The input file was the following:

.EXP
C 10.99
H 1.57
N 4.34
.VAR
H2O	H2O		0	4	0.5
NO3	NO3-		0	2	0.5
Cys	C3H5NO2S-2	0.5	2	0.5
Pb	Pb+2		1	0	0
.END

The file has two main blocks: the experimental block and the variables block.

The experimental block goes first and begins after the .EXP label (actually any line that begins with a dot works). Anything before the label is ignored, so this space may be used to write comments. The format of the experimental block is very simple: each line contains an element (which must exist in the elem.txt file) followed by its experimental percentage. You can specify as many as you want, but most elemental analyses give only C, H, and N.

The variables block begins after the .VAR label (or any line that begins with a dot). In this block you can specify the atoms or groups that you want to vary. Each line specifies one variable; in this example, the first line is water. Each variable is specified by five space- or tab-separated columns as follows:

  1. Name. This is the string that will be used to identify the variable in the output file. The name cannot contain spaces. It has no meaning to the program, only to the user.
  2. Formula. The formula of the fragment. It should only use elements that are defined in the elem.txt file, and it should not use parentheses as they are not recognized. Charge may be specified by a + or - sign optionally followed by a number.
  3. Minimum. The minimum multiplicity for this fragment.
  4. Maximum. The maximum multiplicity for this fragment. If it is smaller than the minimum, the fragment is not varied (i.e., stays at the minimum). See for example the range for lead in the sample file.
  5. Increment. The step by which the multiplicity is increased from the minimum up to the maximum.

The minimum, maximum, and increment are not required to be integers.

The variables block ends as soon as a line that does not comply with the above format is found. Conventionally, a .END label is placed at the end.

Execution of the program gave the following output:

Input file: in2.txt
Experimental composition:
C: 10.99
H: 1.57
N: 4.34
H2O (H2O): from 0.00 to 4.00 step 0.50
NO3 (NO3-1): from 0.00 to 2.00 step 0.50
Cys (C3H5NO2S-2): from 0.50 to 2.00 step 0.50
Pb (Pb+2): from 1.00 to 0.00 step 0.00
H2O: 0.00; NO3: 0.00; Cys: 1.00; Pb: 1.00; Charge=0.00
C=11.04%; H= 1.54%; N= 4.29%; O= 9.81%; S= 9.82%; Pb=63.49%; 
Total squared absolute error = 0.005422
Relative error by element: C=-0.46%; H= 1.66%; N= 1.09%; 

H2O: 0.50; NO3: 0.00; Cys: 1.00; Pb: 1.00; Charge=0.00
C=10.74%; H= 1.80%; N= 4.18%; O=11.93%; S= 9.56%; Pb=61.79%; 
Total squared absolute error = 0.1413
Relative error by element: C= 2.29%; H=-12.95%; N= 3.88%; 

... (Truncated for brevity)

The first part of the output is almost a repetition of the input file. It can be used to identify an output file, and also to make sure that the program interpreted the input file correctly.

The actual results come after the BEST 10 RESULTS title (note: it is possible to output a different number of results by using the -n command line option). Each four-line block describes one composition. The first line gives the values of the variables for that composition; the second gives the elemental composition; the third gives the total squared absolute error, which is the criterion used to sort the results; and the last one gives the relative error by element.

The interpretation for this particular case is that the best empirical formula is Pb(Cys). By the way, it was the expected formula. :-) I decided to show only the simplest and best result, but more than a dozen compounds were obtained, and not all had such simple formulas with such simple interpretations. All the work was done by Fabiola Barrios in her B.S. thesis (Facultad de Quimica, Universidad Nacional Autonoma de Mexico, 2001).

Program files:

The latest version is 0.5.