Tuesday 9 December 2014

SMILES :)

Bismillahirrahmanirrahim

Assalamualaikum dear brothers and sisters!

It has been a tiring week, with the final examination just around the corner. But giving up is not in our dictionary.

Today's post, maybe our last, maybe not, is about SMILES.

SMILES
An abbreviation of Simplified Molecular-Input Line-Entry System. It is a specification in the form of a line notation for describing the structure of chemical species using short ASCII strings. Most molecule editors can import SMILES strings and convert them back into two-dimensional or three-dimensional models of the molecules. The notation was initiated by David Weininger at the USEPA Mid-Continent Ecology Division Laboratory in Duluth in the 1980s. (Wikipedia)


Canonical SMILES
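The SMILES format is a linear text format that can describe the connectivity and chirality of a molecule. The same molecule can usually be written as several different valid SMILES strings (ethanol can be CCO, OCC or C(O)C), so canonical SMILES gives a single 'canonical' form for any particular molecule.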

Isomeric SMILES
The version of the SMILES specification that includes extensions to support the specification of isotopes, chirality and configuration about double bonds.



Applications of SMILES
No.
Applications
Explanations and Examples
1
SMILES Bond
SINGLE ( - )
DOUBLE ( = )
TRIPLE ( # )
AROMATIC ( : )
Single and aromatic bonds are usually left out when writing a SMILES.

For example:
Ethene: C=C
Chloroethene: ClC=C
1,1-Dichloroethene: ClC(Cl)=C
cis-1,2-Dichloroethene: ClC=CCl (the cis geometry itself needs isomeric SMILES, Cl/C=C\Cl)
Trichloroethene: ClC(Cl)=CCl
Perchloroethene: ClC(Cl)=C(Cl)Cl
Rings use the same bond symbols, so cyclohexane and 1,4-dioxane can be written as C1CCCCC1 and O1CCOCC1.
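To see what the bond symbols do, here is a small Python sketch that counts the atoms in a SMILES, including the hydrogens that are never written out. It is only an illustration under our own assumptions: the function name formula_counts is made up for this post, it handles only C, O and Cl with the valences 4, 2 and 1, and it is not how a real chemistry toolkit parses SMILES.

```python
import re
from collections import Counter

# Assumed valences for the three elements this sketch handles.
VALENCE = {"C": 4, "O": 2, "Cl": 1}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
TOKEN = re.compile(r"Cl|[CO]|[-=#]|[()]|\d")


def formula_counts(smiles):
    """Count atoms, including implicit hydrogens, for a small SMILES subset."""
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError("only C, O, Cl, - = #, branches and ring digits are handled")
    atoms = []      # element symbol of each atom, in reading order
    used = []       # valences already spent on written bonds, per atom
    stack = []      # atoms to return to when a branch closes
    rings = {}      # open ring digit -> (atom index, bond order at opening)
    previous = None
    order = 1       # order of the next bond; single unless a symbol says otherwise
    for token in tokens:
        if token in BOND_ORDER:
            order = BOND_ORDER[token]
        elif token == "(":
            stack.append(previous)
        elif token == ")":
            previous = stack.pop()
        elif token.isdigit():
            if token in rings:
                # Second occurrence of the digit: close the ring bond.
                other, opening_order = rings.pop(token)
                ring_order = max(order, opening_order)
                used[other] += ring_order
                used[previous] += ring_order
            else:
                rings[token] = (previous, order)
            order = 1
        else:
            atoms.append(token)
            used.append(0)
            current = len(atoms) - 1
            if previous is not None:
                used[previous] += order
                used[current] += order
            previous = current
            order = 1
    counts = dict(Counter(atoms))
    # Whatever valence is left over on each atom is filled with hydrogen.
    counts["H"] = sum(VALENCE[a] - u for a, u in zip(atoms, used))
    return counts


for smiles in ["C=C", "ClC=C", "ClC(Cl)=CCl", "C1CCCCC1"]:
    print(smiles, formula_counts(smiles))
```

With these assumptions, C=C comes out as 2 carbons and 4 hydrogens, ClC(Cl)=CCl as 2 carbons, 3 chlorines and 1 hydrogen, and C1CCCCC1 as 6 carbons and 12 hydrogens, which matches ethene, trichloroethene and cyclohexane.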
2
SMILES Aromaticity
Aromatic C, O, S and N atoms are shown in lower case as 'c', 'o', 's' and 'n' respectively. Benzene, pyridine and furan can be represented by the SMILES c1ccccc1, n1ccccc1 and o1cccc1 respectively. Bonds between aromatic atoms are aromatic by default, although they can be specified explicitly using the ':' symbol. Aromatic atoms can also be singly bonded to each other, so biphenyl can be represented by c1ccccc1-c2ccccc2. Aromatic nitrogen bonded to hydrogen, as found in pyrrole, must be represented as [nH]; imidazole, for example, is written in SMILES notation as n1c[nH]cc1.
[Figure: fluorenone]

3
SMILES Isotopes
Isotopes are specified with a number equal to the integer isotopic mass preceding the atomic symbol, inside square brackets. Benzene in which one atom is carbon-14 is written as [14c]1ccccc1 and deuterochloroform is [2H]C(Cl)(Cl)Cl.
4
SMILES Branches
Branches are represented by enclosure in parentheses, and can be nested or stacked.
Examples:
CC(O)CC is 2-Butanol
OCC(C)C is iso-Butanol
OC(C)(C)C is tert-Butanol
A branch cannot begin a SMILES notation.
A branch cannot immediately follow a double- or triple-bond symbol. Example: C=(CC)C is invalid, but C(=CC)C or C(CC)=C are valid SMILES.
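The branch rules above are easy to check by machine. Here is a minimal Python sketch that does so; the function name check_branches and the wording of its messages are our own invention, and it only looks at parentheses, not at whether the rest of the SMILES makes chemical sense.

```python
def check_branches(smiles):
    """Return a list of problems found with the parentheses in a SMILES."""
    problems = []
    depth = 0          # how many branches are currently open
    previous = ""      # the character just before the current one
    for position, char in enumerate(smiles):
        if char == "(":
            if position == 0:
                problems.append("a branch cannot begin a SMILES")
            elif previous in ("=", "#"):
                problems.append(
                    f"branch directly after '{previous}' at position {position}"
                )
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                problems.append(f"unmatched ')' at position {position}")
                depth = 0
        previous = char
    if depth > 0:
        problems.append("unclosed '('")
    return problems


for smiles in ["C=(CC)C", "C(=CC)C", "C(CC)=C", "OC(C)(C)C"]:
    print(smiles, check_branches(smiles) or "looks fine")
```

Only C=(CC)C is reported as a problem; the other three pass, including the stacked branches in tert-butanol.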
5
SMILES Symbols
A SMILES is a string of alphanumeric characters and certain punctuation symbols. It terminates at the first space encountered when read left to right. The ORGANIC SUBSET: B, C, N, O, P, S, F, Cl, Br, I. Atoms in this subset can be written without square brackets.
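As a rough illustration of how such a string is read, here is a small Python sketch that cuts a SMILES into its symbols. The name tokenize and the exact regular expression are our own assumptions; it covers the organic subset, the lowercase aromatic atoms c, n, o and s, bracket atoms, the bond symbols, parentheses and single ring digits, and nothing more.

```python
import re

# Two-letter symbols (Cl, Br) must be tried before one-letter ones (C, B).
TOKEN = re.compile(
    r"\[[^\]]+\]"      # bracket atom, e.g. [nH], [14c], [Fe++]
    r"|Cl|Br"          # two-letter atoms of the organic subset
    r"|[BCNOPSFI]"     # one-letter atoms of the organic subset
    r"|[cnos]"         # aromatic atoms
    r"|[-=#:]"         # bond symbols
    r"|[()]"           # branch open / close
    r"|\d"             # ring-closure digit
)


def tokenize(text):
    """Split the SMILES at the start of `text` into a list of symbols."""
    smiles = text.split(" ", 1)[0]   # reading stops at the first space
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognised characters in {smiles!r}")
    return tokens


print(tokenize("ClC(Cl)=C 1,1-dichloroethene"))
print(tokenize("[2H]C(Cl)(Cl)Cl"))
```

Anything after the first space, such as the name in the first call, is ignored, and Cl is kept together as one atom instead of being read as C followed by a stray letter.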
6
Other SMILES Atoms
Aliphatic or nonaromatic carbon: C
Atom in aromatic ring: lowercase letter
Designate ring closure with pairs of matching digits, e.g. c1ccccc1 is benzene, whereas C1CCCCC1 is cyclohexane.
7
SMILES Charges
Specify attached hydrogens and charges in square brackets. The number of attached hydrogens is the symbol H followed by an optional digit.
Examples:
[H+] proton
[OH-] hydroxide anion
[OH3+] hydronium cation
[Fe++] iron(II) cation
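To make the bracket notation concrete, here is a short Python sketch that pulls a bracket atom apart into its pieces: an optional leading isotope number, the symbol, the attached hydrogens and the charge. The function name parse_bracket_atom and the regular expression are our own assumptions and cover only the forms shown in this post.

```python
import re

BRACKET_ATOM = re.compile(
    r"\[(?P<isotope>\d+)?"               # optional isotope mass, e.g. 14
    r"(?P<symbol>[A-Z][a-z]?|[cnos])"    # element symbol, or an aromatic atom
    r"(?P<h>H(?P<hcount>\d)?)?"          # attached hydrogens: H, H2, H3 ...
    r"(?P<charge>\++|-+|[+-]\d+)?\]"     # charge: +, ++, -, +2 ...
)


def parse_bracket_atom(text):
    """Split one bracket atom such as [OH3+] into its parts."""
    match = BRACKET_ATOM.fullmatch(text)
    if match is None:
        raise ValueError(f"not a bracket atom this sketch understands: {text!r}")
    hydrogens = 0
    if match["h"] is not None:
        hydrogens = int(match["hcount"] or 1)   # a bare H means one hydrogen
    charge = 0
    raw = match["charge"]
    if raw:
        sign = 1 if raw[0] == "+" else -1
        # "+2" carries a digit; "++" is counted sign by sign.
        magnitude = int(raw[1:]) if raw[1:].isdigit() else len(raw)
        charge = sign * magnitude
    return {
        "isotope": int(match["isotope"]) if match["isotope"] else None,
        "symbol": match["symbol"],
        "hydrogens": hydrogens,
        "charge": charge,
    }


for atom in ["[H+]", "[OH-]", "[OH3+]", "[Fe++]", "[14c]", "[nH]"]:
    print(atom, parse_bracket_atom(atom))
```

For [OH3+] this reports oxygen with three hydrogens and a charge of +1, and for [Fe++] it reports iron with a charge of +2.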
8
SMILES Cyclic Structures
Break one single or one aromatic bond in each ring, and designate the two ring-breaking atoms by the same digit, entered immediately following each atomic symbol. The first occurrence of the digit marks the start of the ring and the second marks its end. The digits can be assigned in any order. By convention only the digits 1 to 9 are used, and a digit should appear only twice for a given ring. An atom can be associated with two consecutive digits, e.g. naphthalene: c12ccccc1cccc2
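The ring digits can be paired up with a few lines of Python. The sketch below is only an illustration: the name ring_closures is our own, atoms are simply numbered from 0 in reading order, and only single ring digits are handled.

```python
import re

# Atoms (bracketed, two-letter or one-letter) and ring digits; bond symbols
# and parentheses are skipped because they do not change the atom numbering.
ATOM_OR_DIGIT = re.compile(r"\[[^\]]+\]|Cl|Br|[A-Za-z]|\d")


def ring_closures(smiles):
    """Return (start atom, end atom) index pairs for each ring-closure bond."""
    open_rings = {}    # digit -> index of the atom that opened the ring
    bonds = []
    atom_index = -1
    for token in ATOM_OR_DIGIT.findall(smiles):
        if token.isdigit():
            if atom_index < 0:
                raise ValueError("ring digit before any atom")
            if token in open_rings:
                # Second occurrence of the digit: the ring is closed here.
                bonds.append((open_rings.pop(token), atom_index))
            else:
                open_rings[token] = atom_index
        else:
            atom_index += 1
    if open_rings:
        raise ValueError(f"unclosed ring digit(s): {sorted(open_rings)}")
    return bonds


print(ring_closures("C1CCCCC1"))          # cyclohexane
print(ring_closures("c12ccccc1cccc2"))    # naphthalene
```

For naphthalene the first atom carries both digits, so it takes part in both ring-closure bonds: one to atom 5 and one to atom 9.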
9
SMILES Conventions
Avoid two consecutive left parentheses if possible. Strive for the fewest possible branches. Tautomeric bonds are not designated; enter the appropriate form.
10
SMILES Fragments
Nitro N(=O)(=O), Nitrate ON(=O)(=O), Nitrite ON(=O), Sulfonic acid S(=O)(=O)O, Cyanide/Nitrile C#N, Azide N=N#N, Azido N=[N+]=[N-]
11
SMILES Metals
[Al] [As] [Au] [Be] [Bi] [Cd] [Ca] [Fe] [Hg] [K] [Li] [Mg] [Na] [Ni] [Pt] [Sb] [Sn] [Zn] [Zr]


Examples of Molecular Structure and SMILES Notations

Lowercase form    Uppercase (Kekulé) form    Name
c1ccccc1          C1=CC=CC=C1                benzene
c1ccc2CCCc2c1     C1=CC=CC(CCC2)=C12         indane
c1occc1           C1OC=CC=1                  furan
c1ccc1            C1=CC=C1                   cyclobutadiene
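
A quick sanity check for pairs like these is that the lowercase and uppercase spellings of one molecule must contain the same atoms. Here is a tiny Python sketch of that check; the name heavy_atoms is our own, and it only counts unbracketed organic-subset atoms, so it says nothing about bonds or hydrogens.

```python
import re
from collections import Counter

# Unbracketed atoms only: the organic subset plus aromatic c, n, o, s.
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[cnos]")


def heavy_atoms(smiles):
    """Count the non-hydrogen atoms in a SMILES, ignoring aromatic case."""
    return Counter(symbol.capitalize() for symbol in ATOM.findall(smiles))


pairs = [
    ("c1ccccc1", "C1=CC=CC=C1"),               # benzene
    ("c1ccc2CCCc2c1", "C1=CC=CC(CCC2)=C12"),   # indane
    ("c1ccc1", "C1=CC=C1"),                    # cyclobutadiene
]
for lowercase, uppercase in pairs:
    print(lowercase, uppercase, heavy_atoms(lowercase) == heavy_atoms(uppercase))
```

If the two counts differ, at least one of the two strings has been copied wrongly.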
We hope this information is sufficient for your reference. Thanks for lending us your time.

Have a good day, XOXO. 
