Powerful and Easy to Use
Set Up Correlation Calculations
A series of intuitive dialog boxes in CODESSA's graphical interface guides you through the process of setting up the correlation, pairing property values with data files and determining which descriptors are to be computed.
Display Correlation Results
The GUI displays the property data plotted against the computed multilinear regression line from the correlation equation.
View Details of Descriptors
The convenient interface will show detailed information about descriptor values at each atom in the molecule.
Beautifully rendered 3-D Structures provide annotation of atom and bond descriptors. Fragments are defined by selecting atoms within a structure. Once a fragment string is defined, CODESSA will automatically identify the fragments within every structure. New Fragment descriptors can be calculated once fragments have been defined.
Tables and Sets
Navigate Structures, Properties, Descriptors, and Correlations using powerful Tables with complete filtering and sorting, allowing you to see the data from any viewpoint.
The "Sets" tools can be used to define sub-groups of Structures or Descriptors to narrow your focus without losing any of the existing data. Create any number of "training sets" or "test sets" to fully explore your property.
Computes 600+ Descriptors from AMPAC Results
Simple descriptors reflect the composition of the molecule, including numbers of atoms, numbers of atoms of a specific element, numbers of different types of bonds and rings, and molecular weight. Relative and average descriptors in this category are derived from various combinations of these values.
Topological descriptors (also called topological indices) describe the atomic connectivity in the molecule. CODESSA calculates the following standard indexes: Wiener (atomic distance matrix), Randic (connectivity patterns), Kier-Hall (connectivity), Kier (shape and flexibility), Balaban's J index, and various information content descriptors (total, structural, complementary, bonding).
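To make the distance-matrix idea concrete, here is a minimal sketch of the Wiener index: the sum of shortest-path bond counts over all pairs of atoms in the hydrogen-suppressed graph. The function name and the adjacency-dictionary input are assumptions chosen for illustration; this is not CODESSA's code or file format.

```python
from collections import deque

def wiener_index(adjacency):
    """Sum of shortest-path bond counts over all unordered atom pairs.

    `adjacency` maps each atom index to the list of atoms bonded to it
    (a hypothetical input format, hydrogens omitted).
    """
    total = 0
    for start in adjacency:
        # Breadth-first search gives topological distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for neighbor in adjacency[atom]:
                if neighbor not in dist:
                    dist[neighbor] = dist[atom] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each end

# Carbon skeletons only: n-butane is a chain, isobutane is branched.
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

print(wiener_index(n_butane), wiener_index(isobutane))
```

The branched isomer gives the smaller value, which is why indexes of this kind are used to capture molecular branching.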
The optimized 3D structure from AMPAC is used to compute values describing: moments of inertia, 2D shadow areas, molecular volume, molecular surface area and gravitation indexes.
Quantum mechanical results from AMPAC provide the charges needed to compute these descriptors, which include minimum and maximum charges, absolute atomic charges, Zefirov partial charges, dipole moments, molecular polarizability, and a wide array of charged partial surface areas.
Charged Partial Surface Areas
This important set of descriptors was invented by Peter Jurs. They are among the most frequently cited in the literature and are intuitively related to many chemical and biological properties. The portions of molecules assigned a positive or negative charge, the total surface area, and the relative values and differences between these quantities all have chemical significance and are computed automatically by CODESSA.
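The quantities named above can be illustrated with a short sketch. It assumes each atom already has a partial charge and a solvent-accessible surface area contribution. The function name, dictionary keys, and input numbers are illustrative assumptions; the source does not state which CPSA variants CODESSA computes or how.

```python
def cpsa_basic(charges, areas):
    """Basic charged-partial-surface-area quantities from per-atom data.

    charges: per-atom partial charges
    areas:   per-atom surface area contributions (same order as charges)
    """
    if len(charges) != len(areas):
        raise ValueError("charges and areas must have the same length")
    total = sum(areas)
    if total <= 0:
        raise ValueError("total surface area must be positive")
    # Surface area belonging to positively / negatively charged atoms.
    positive = sum(a for q, a in zip(charges, areas) if q > 0)
    negative = sum(a for q, a in zip(charges, areas) if q < 0)
    return {
        "positive_area": positive,
        "negative_area": negative,
        "total_area": total,
        "difference": positive - negative,       # difference between the two
        "fraction_positive": positive / total,   # relative values
        "fraction_negative": negative / total,
    }

# Made-up charges and areas for a four-atom example (not AMPAC output).
result = cpsa_basic([0.2, -0.4, 0.1, 0.1], [10.0, 30.0, 5.0, 5.0])
print(result)
```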
These indexes represent the majority of the descriptors that CODESSA computes. Many are computed directly from molecular wavefunctions and include: quantum mechanical energy (electronic, nuclear attraction/repulsion) distributions, ionization potentials, electron affinities, and resonance/exchange energies. Other descriptors are computed from various characteristics of the molecular orbitals and include: HOMO/LUMO energy gap, MO energies, bond orders and nucleophilic/electrophilic reactivity indexes. Various values describing solvation energies are also available from the QM information including the Kirkwood-Onsager cavitation indexes.
ISODENSITY Surface and Electrostatic Potential (ESP) mapped surfaces from AMPAC or GAUSSIAN provide a new collection of very powerful descriptors.
CODESSA supports user-defined subsets of connected atoms within a molecule called Fragments. CODESSA can generate descriptors for these user-defined fragments utilizing atomic data for just those atoms. This allows the user to hide portions of the molecule or focus only on the common sub-structure of the training set.
For example, a chlorine substituent on a benzene ring would dominate charge-based descriptors, whereas defining a benzene fragment allows the effect of the chlorine on the ring system to be explored separately.
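The chlorobenzene example can be sketched in a few lines: the same charge-based descriptors are computed once over every atom and once over only the atoms in a user-defined fragment. The function, the atom ordering, and the charge values are made up for illustration and are not AMPAC or CODESSA output.

```python
def charge_descriptors(charges, atom_indices=None):
    """Charge-based descriptors over all atoms or over a chosen fragment."""
    if atom_indices is None:
        atom_indices = range(len(charges))
    selected = [charges[i] for i in atom_indices]
    if not selected:
        raise ValueError("fragment contains no atoms")
    return {
        "max_charge": max(selected),
        "min_charge": min(selected),
        "sum_abs_charge": sum(abs(q) for q in selected),
    }

# Heavy atoms of chlorobenzene: indices 0-5 are ring carbons, 6 is Cl.
# Charges are illustrative placeholders only.
charges = [0.05, -0.10, -0.08, -0.09, -0.08, -0.10, -0.15]
ring_fragment = [0, 1, 2, 3, 4, 5]

whole = charge_descriptors(charges)
ring = charge_descriptors(charges, ring_fragment)
print(whole["min_charge"], ring["min_charge"])
```

With these placeholder values the most negative charge of the whole molecule sits on the chlorine, while the fragment descriptor reports the most negative ring carbon instead.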
CODESSA's thermodynamic descriptors are available from analyzing the vibrational modes predicted by AMPAC. Various partitions of the vibrational, rotational and translational energies complement informative combinations of heat of formation, entropy and normal modes.
Construct Custom Descriptors
Combine existing descriptors using mathematical operations to create new descriptors. Use count-based descriptors to normalize any molecular descriptor. Experiment and follow your intuition. As good as CODESSA may be, you still know more chemistry!
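As a rough illustration of the idea, the sketch below builds two custom descriptors from existing ones: a surface area normalized by an atom count, and an orbital energy gap. The helper name `add_descriptor`, the descriptor keys, and the values are hypothetical and do not reflect CODESSA's interface.

```python
def add_descriptor(descriptors, name, func, *inputs):
    """Return a copy of `descriptors` with a new entry computed from `inputs`."""
    missing = [key for key in inputs if key not in descriptors]
    if missing:
        raise KeyError(f"unknown descriptors: {missing}")
    updated = dict(descriptors)
    updated[name] = func(*(descriptors[key] for key in inputs))
    return updated

# Hypothetical descriptor values for one molecule.
molecule = {"surface_area": 120.0, "n_atoms": 12, "homo": -9.5, "lumo": 0.5}

# Normalize a descriptor by a count-based descriptor.
molecule = add_descriptor(
    molecule, "area_per_atom", lambda area, n: area / n, "surface_area", "n_atoms"
)
# Combine two descriptors with a simple arithmetic operation.
molecule = add_descriptor(
    molecule, "gap", lambda lumo, homo: lumo - homo, "lumo", "homo"
)
print(molecule["area_per_atom"], molecule["gap"])
```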
Advanced Correlation Development and Statistical Analysis
Heuristic Descriptor Selection and Correlation
Selecting the best subset of descriptors from the hundreds available in CODESSA is critical to deriving reliable correlations. CODESSA's heuristic method, which is unique to CODESSA, follows a sensible and intuitive pathway for eliminating variables from consideration. Further statistical tests are applied to the remaining descriptors, producing a ranked set of the best correlations, rated by various quality indexes.
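The source does not spell out the elimination rules, so the following is only a sketch of one plausible pre-screening pass, not CODESSA's heuristic method. It assumes two rules: discard descriptors that barely correlate with the property, then discard any descriptor that is highly intercorrelated with a better one already kept. The function names, thresholds, and data are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def prescreen(descriptors, prop, min_r2=0.1, max_inter_r=0.9):
    """Return descriptor names that survive both assumed elimination rules."""
    # Rule 1: keep only descriptors with some correlation to the property.
    scored = []
    for name, values in descriptors.items():
        r2 = pearson(values, prop) ** 2
        if r2 >= min_r2:
            scored.append((r2, name))
    scored.sort(reverse=True)  # best single-descriptor correlation first
    # Rule 2: skip descriptors highly intercorrelated with one already kept.
    kept = []
    for _, name in scored:
        if all(abs(pearson(descriptors[name], descriptors[k])) <= max_inter_r
               for k in kept):
            kept.append(name)
    return kept

# Made-up property and descriptor values for five structures.
prop = [1.0, 2.0, 3.0, 4.0, 5.0]
descriptors = {
    "d1": [2.0, 4.0, 6.0, 8.0, 10.0],     # tracks the property closely
    "d2": [2.1, 3.9, 6.2, 7.8, 10.1],     # nearly a copy of d1
    "d3": [1.0, 3.0, 2.0, 2.0, 4.0],      # moderately correlated
    "const": [3.0, 3.0, 3.0, 3.0, 3.0],   # no variance
    "noise": [5.0, 1.0, 4.0, 1.0, 5.0],   # uncorrelated with the property
}
print(prescreen(descriptors, prop))
```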
Correlation equations can be derived with different numbers of descriptors. As with all methods in CODESSA, the squared correlation coefficient (R²), F-ratio, standard deviation, and standard error are listed for the overall correlation, and t-test values are provided for each parameter.
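For readers who want to check these quality indexes by hand, the sketch below computes R², the F-ratio, and the standard error of the estimate from observed and fitted property values, using the standard textbook formulas for a least-squares fit with an intercept. The function name and data are illustrative, and the per-parameter t-test values are omitted because they need the full design matrix.

```python
from math import sqrt

def regression_quality(observed, predicted, n_descriptors):
    """Return (R squared, F-ratio, standard error) for a least-squares fit."""
    n = len(observed)
    dof = n - n_descriptors - 1          # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    mean = sum(observed) / n
    sst = sum((y - mean) ** 2 for y in observed)                  # total
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))  # residual
    r2 = 1.0 - sse / sst
    if sse == 0:
        f_ratio = float("inf")           # perfect fit
    else:
        f_ratio = ((sst - sse) / n_descriptors) / (sse / dof)
    std_error = sqrt(sse / dof)
    return r2, f_ratio, std_error

# Made-up data: y fitted against x = 1..5 gives y_hat = 0.6 + 0.8 * x.
observed = [1.0, 3.0, 2.0, 5.0, 4.0]
predicted = [0.6 + 0.8 * x for x in (1, 2, 3, 4, 5)]

r2, f_ratio, std_error = regression_quality(observed, predicted, n_descriptors=1)
print(round(r2, 3), round(f_ratio, 3), round(std_error, 3))
```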
Best Multilinear Regression
BMLR is an automated procedure that suggests which correlation accounts for the most variation in the property.
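The source does not describe how BMLR searches, so the sketch below swaps in a plain exhaustive search: fit every descriptor subset of a given size by least squares and rank the fits by R². It is meant only to show what "accounting for the most variation" means, and it is not CODESSA's BMLR procedure. All names and data are assumptions.

```python
from itertools import combinations

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:   # crude singularity check
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_r2(columns, y):
    """R squared of a least-squares fit of y on the columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    # Normal equations: (X^T X) coef = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    pred = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean = sum(y) / len(y)
    sst = sum((yi - mean) ** 2 for yi in y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    return 1.0 - sse / sst

def best_subsets(descriptors, y, size, top=3):
    """Rank all descriptor subsets of `size` by R squared, best first."""
    results = []
    for names in combinations(sorted(descriptors), size):
        try:
            results.append((fit_r2([descriptors[n] for n in names], y), names))
        except ValueError:
            continue                     # skip collinear subsets
    results.sort(reverse=True)
    return results[:top]

# Made-up data in which the property is exactly 2*a + 3*b + 1.
descriptors = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "c": [1.0, 0.0, 0.0, 1.0, 1.0, 0.0],
}
prop = [9.0, 8.0, 19.0, 18.0, 29.0, 28.0]
ranking = best_subsets(descriptors, prop, size=2)
print(ranking[0])
```

An exhaustive search grows combinatorially with the number of descriptors, so it is only practical for a small pool; with hundreds of descriptors some form of guided search is needed.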