Chemical graph generator

A chemical graph generator is a software package that generates computer representations of chemical structures adhering to certain boundary conditions. The development of such software packages is a research topic in cheminformatics. Chemical graph generators are used in virtual library generation in drug design, in molecular design with specified properties (called inverse QSAR/QSPR), in organic synthesis design and retrosynthesis, and in systems for computer-assisted structure elucidation (CASE). CASE systems have regained interest for the structure elucidation of unknowns in computational metabolomics, a current area of computational biology.

History

Overlapping substructures of caffeine. Two substructures of a caffeine molecule are given, (A) and (B). The overlap of these substructures is highlighted in green in the caffeine structure (C).

Molecular structure generation is a branch of graph generation problems.[1] Molecular structures are graphs with chemical constraints such as valences, bond multiplicity and fragments. These generators are the core of CASE systems. In a generator, the molecular formula is the basic input. If fragments are obtained from the experimental data, they can also be used as inputs to accelerate structure generation. The first structure generators were versions of graph generators modified for chemical purposes. One of the first structure generators was CONGEN,[2] originally developed for the DENDRAL project, the first artificial intelligence project in organic chemistry.[3] DENDRAL was developed as part of the Mariner program launched by NASA to search for life on Mars.[4][5] CONGEN dealt well with overlaps in substructures. The overlaps among substructures, rather than atoms, were used as the building blocks. For the case of stereoisomers, symmetry group calculations were performed for duplicate detection.

After DENDRAL, another mathematical method, MASS,[6] a tool for mathematical synthesis and analysis of molecular structures, was reported. As with CONGEN, the MASS algorithm worked as an adjacency matrix generator. Many mathematical generators are descendants of efficient branch-and-bound methods from Igor Faradjev[7] and Ronald C. Read's orderly generation method.[8] Although their reports are from the 1970s, these studies are still the fundamental references for structure generators. In the orderly generation method, specific order-check functions are performed on graph representatives, such as vectors. For example, MOLGEN[9] performs a descending order check while filling rows of adjacency matrices. This descending order check is based on an input valence distribution. The literature classifies generators into two major types: structure assembly and structure reduction. The algorithmic complexity and the run time are the criteria used for comparison.
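The order check at the heart of orderly generation can be illustrated with a deliberately naive sketch. It assumes simple (unlabelled, single-bond) graphs stored as upper-triangle adjacency vectors, and it calls a vector canonical when no relabelling of the vertices produces a lexicographically larger vector. The function names are hypothetical, and this is not the MOLGEN implementation: a real orderly generator applies such a test while the matrix is being filled in order to prune the search, whereas this sketch only filters a complete enumeration.

```python
from itertools import combinations, permutations, product


def relabel(vec, perm, pairs, index):
    """Adjacency vector of the same graph after renaming vertex i to perm[i]."""
    out = [0] * len(vec)
    for (i, j), bit in zip(pairs, vec):
        out[index[frozenset((perm[i], perm[j]))]] = bit
    return tuple(out)


def is_canonical(vec, n):
    """True if vec is the lexicographically largest vector in its isomorphism class."""
    pairs = list(combinations(range(n), 2))
    index = {frozenset(p): k for k, p in enumerate(pairs)}
    return all(relabel(vec, perm, pairs, index) <= vec
               for perm in permutations(range(n)))


def canonical_graphs(n):
    """One representative per isomorphism class of simple graphs on n vertices."""
    m = n * (n - 1) // 2
    return [vec for vec in product((0, 1), repeat=m) if is_canonical(vec, n)]


print(len(canonical_graphs(4)))  # 11 non-isomorphic simple graphs on 4 vertices
```

Because exactly one vector per isomorphism class passes the check, no pairwise isomorphism tests between generated graphs are needed, which is the property that orderly methods exploit.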

Structure assembly

The generation process starts with a set of atoms from the molecular formula. In structure assembly, atoms are combinatorially connected to consider all possible extensions. If substructures are obtained from the experimental data, the generation starts with these substructures. These substructures provide known bonds in the molecule. One of the earliest attempts was made by Hidetsugu Abe in 1975 using a pattern recognition-based structure generator.[10] The algorithm had two steps: first, the prediction of the substructure from low-resolution spectral data; second, the assembly of these substructures based on a set of construction rules. Hidetsugu Abe and co-workers published the first paper on CHEMICS,[11] a CASE tool comprising several structure generation methods. The program relies on a predefined non-overlapping fragment library. CHEMICS generates different types of component sets ranked from primary to tertiary based on component complexity. The primary set contains atoms, i.e., C, N, O and S, with their hybridization. The secondary and tertiary component sets are built layer by layer starting with these primary components. These component sets are represented as vectors and are used as building blocks in the process.

Substantial contributions were made by Craig Shelley and Morton Munk, who published a large number of CASE papers in this field. The first of these papers reported a structure generator, ASSEMBLE.[12] The algorithm is considered one of the earliest assembly methods in the field. As the name indicates, the algorithm assembles substructures with overlaps to construct structures. ASSEMBLE overcomes overlapping by including a “neighbouring atom tag”. The generator is purely mathematical and does not involve the interpretation of any spectral data. Spectral data are used for structure scoring and substructure information. Based on the molecular formula, the generator forms bonds between pairs of atoms, and all the extensions are checked against the given constraints. If the process is considered as a tree, the first node of the tree is an atom set with substructures if any are provided by the spectral data. By extending the molecule with a bond, an intermediate structure is built. Each intermediate structure can be represented by a node in the generation tree. ASSEMBLE was developed with a user-friendly interface to facilitate use. The second version of ASSEMBLE was released in 2000.[13] Another assembly method is GENOA.[14] Compared to ASSEMBLE and many other generators, GENOA is a constructive substructure search-based algorithm, and it assembles different substructures by also considering the overlaps.

The efficiency and exhaustivity of generators are also related to the data structures. Unlike previous methods, AEGIS[15] was a list-processing generator. Compared to adjacency matrices, list data requires less memory. As no spectral data was interpreted in this system, the user needed to provide substructures as inputs. Structure generators can also vary based on the type of data used, such as HMBC, HSQC and other NMR data. LUCY is an open-source structure elucidation method based on the HMBC data of unknown molecules.[16] It involves an exhaustive two-step structure generation process: first, all combinations of interpretations of HMBC signals are implemented in a connectivity matrix, which is then completed by a deterministic generator filling in missing bond information. This platform could generate structures of arbitrary size; however, molecular formulas with more than 30 heavy atoms are too time-consuming for practical applications. This limitation highlighted the need for a new CASE system. SENECA was developed to eliminate the shortcomings of LUCY.[17] To overcome the limitations of the exhaustive approach, SENECA was developed as a stochastic method to find optimal solutions. The system comprises two stochastic methods: simulated annealing and genetic algorithms. First, a random structure is generated; then, its energy is calculated to evaluate the structure and its spectral properties. By transforming this structure into another structure, the process continues until the optimum energy is reached. In the generation, this transformation relies on equations based on Jean-Loup Faulon's rules.[18] LSD (Logic for Structure Determination)[19] is an important contribution from French scientists. The tool uses spectral data information such as HMBC and COSY data to generate all possible structures. LSD is an open-source structure generator released under the General Public License (GPL).
A well-known commercial CASE system, StrucEluc,[20] also features an NMR-based generator. This tool is from ACD Labs; notably, its developers include Mikhail Elyashberg, one of the developers of MASS. COCON[21] is another NMR-based structure generator, relying on theoretical data sets for structure generation. Except for J-HMBC and J-COSY, all NMR types can be used as inputs.

In 1994, Hu and Xu reported an integer partition-based structure generator.[22] The decomposition of the molecular formula into fragments, components and segments was performed as an application of integer partitioning. These fragments were then used as building blocks in the structure generator. This structure generator was part of a CASE system, ESESOC.[23]
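Integer partitioning itself is easy to sketch. The generator below lists every way of writing a positive integer as a non-increasing sum, which is the kind of decomposition one could apply, for instance, to the number of atoms of one element when distributing them over building blocks. It is only an illustration of the underlying combinatorial step, with a hypothetical function name, and is not the procedure published by Hu and Xu.

```python
def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing tuple of positive integers."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(max_part, 0, -1):
        # The remaining parts may not exceed the part just chosen.
        for rest in partitions(n - first, first):
            yield (first,) + rest


print(list(partitions(4)))
# [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]
```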

Breadth-first search generation. Molecular structure generation is explained step by step. Starting from a set of atoms, bonds are added between atom pairs until reaching saturated structures.

A series of stochastic generators was reported by Jean-Loup Faulon. The software, MOLSIG,[24] was integrated into this stochastic generator for canonical labelling and duplicate checks.[25] As for many other generators, the tree approach is the skeleton of Jean-Loup Faulon's structure generators. However, considering all possible extensions leads to a combinatorial explosion. Orderly generation is performed to cope with this exhaustivity. Many assembly algorithms, such as OMG,[26] MOLGEN and Jean-Loup Faulon's structure generator,[27] are orderly generation methods. Jean-Loup Faulon's structure generator relies on equivalence classes over atoms. Atoms with the same interaction type and element are grouped in the same equivalence class. Rather than extending all atoms in a molecule, one atom from each class is connected with other atoms. Similar to the former generator, Julio Peironcely's structure generator, OMG, takes atoms and substructures as inputs and extends the structures using a breadth-first search method. This tree extension terminates when all the branches reach saturated structures.
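A breadth-first assembly of this kind can be sketched in a few lines for very small formulas. The sketch below makes several simplifying assumptions that are not taken from any of the cited tools: hydrogens are implicit, only the heavy atoms are graph vertices, the valence table is hard-coded, bond orders are capped at three, and duplicates are removed with a brute-force canonical form that tries every element-preserving relabelling. All function names are hypothetical.

```python
from itertools import combinations, permutations

VALENCE = {"C": 4, "N": 3, "O": 2}  # assumed standard valences


def canonical(state, elements, pairs, index):
    """Largest bond vector over all relabellings that keep each atom's element."""
    n = len(elements)
    best = state
    for perm in permutations(range(n)):
        if any(elements[perm[i]] != elements[i] for i in range(n)):
            continue
        image = tuple(state[index[frozenset((perm[i], perm[j]))]] for i, j in pairs)
        best = max(best, image)
    return best


def bond_sums(state, pairs, n):
    total = [0] * n
    for (i, j), order in zip(pairs, state):
        total[i] += order
        total[j] += order
    return total


def connected(state, pairs, n):
    seen, stack = {0}, [0]
    while stack:
        a = stack.pop()
        for (i, j), order in zip(pairs, state):
            if order and a in (i, j):
                b = j if a == i else i
                if b not in seen:
                    seen.add(b)
                    stack.append(b)
    return len(seen) == n


def generate(elements, hydrogens):
    """Breadth-first generation of heavy-atom skeletons for a molecular formula."""
    n = len(elements)
    pairs = list(combinations(range(n), 2))
    index = {frozenset(p): k for k, p in enumerate(pairs)}
    spare = sum(VALENCE[e] for e in elements) - hydrogens
    if spare < 0 or spare % 2:
        return []
    level = {tuple([0] * len(pairs))}
    for _ in range(spare // 2):  # each tree level adds one unit of bond order
        next_level = set()
        for state in level:
            used = bond_sums(state, pairs, n)
            for k, (i, j) in enumerate(pairs):
                if (state[k] < 3 and used[i] < VALENCE[elements[i]]
                        and used[j] < VALENCE[elements[j]]):
                    child = state[:k] + (state[k] + 1,) + state[k + 1:]
                    next_level.add(canonical(child, elements, pairs, index))
        level = next_level
    return sorted(s for s in level if connected(s, pairs, n))


print(len(generate(["C", "C", "O"], 6)))  # 2: ethanol and dimethyl ether
```

Storing a whole level of intermediate structures at once is what makes this approach memory-hungry, and the brute-force canonical form is only feasible for a handful of atoms.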

OMG generates structures based on the canonical augmentation method from Brendan McKay's NAUTY package. The algorithm calculates canonical labelling and then extends structures by adding one bond. To keep the extension canonical, canonical bonds are added.[28] Although NAUTY is an efficient tool for graph canonical labelling, OMG is approximately 2000 times slower than MOLGEN.[29] The problem is the storage of all the intermediate structures. OMG has since been parallelized, and the developers released PMG (Parallel Molecule Generator).[30] MOLGEN outperforms PMG using only 1 core; however, PMG outperforms MOLGEN by increasing the number of cores to 10.

Constructive search algorithms are branch-and-bound methods, such as Igor Faradjev's algorithm, and offer an additional solution to memory problems. Branch-and-bound methods are matrix generation algorithms. In contrast to previous methods, these methods build all the connectivity matrices without building intermediate structures. In these algorithms, canonicity criteria and isomorphism checks are based on automorphism groups from mathematical group theory. MASS, SMOG[31] and Ivan Bangov's algorithm[32] are good examples in the literature. MASS is a method of mathematical synthesis. First, it builds all incidence matrices for a given molecular formula. The atom valences are then used as the input for matrix generation. The matrices are generated by considering all the possible interactions among atoms with respect to the constraints and valences. The benefit of constructive search algorithms is their low memory usage. SMOG is a successor of MASS.

Unlike previous methods, MOLGEN is the only maintained efficient generic structure generator, developed as a closed-source platform by a group of mathematicians as an application of computational group theory. MOLGEN is an orderly generation method. Many different versions of MOLGEN have been developed, and they provide various functions. Based on the users' needs, different types of inputs can be used. For example, MOLGEN-MS[33] allows users to input mass spectrometry data of an unknown molecule. Compared to many other generators, MOLGEN approaches the problem from different angles. The key feature of MOLGEN is generating structures without building all the intermediate structures and without generating duplicates.

As of 2021, the most recent studies in the field are from Kimito Funatsu's research group. In this type of assembly method, building blocks such as ring systems and atom fragments are used in the structure generation.[34] Every intermediate structure is extended by adding building blocks in all possible ways. To reduce the number of duplicates, Brendan McKay's canonical path augmentation method is used. To overcome the combinatorial explosion in the generation, the applicability domain and ring systems are detected based on inverse QSPR/QSAR analysis.[35] The applicability domain, or target area, is described based on given biological as well as pharmaceutical activity information from QSPR/QSAR.[36] In that study, monotonically changed descriptors (MCD) are used to describe applicability domains. For every extension of an intermediate structure, the MCDs are updated. The use of MCDs reduces the search space in the generation process. A drawback of QSPR/QSAR-based structure generation is that the generated structures may not be synthesizable. Using retrosynthesis paths in the generation makes the generation process more efficient. For example, a well-known tool called RetroPath[37] is used for molecular structure enumeration and virtual screening based on the given reaction rules.[38] Its core algorithm is a breadth-first method, generating structures by applying reaction rules to each source compound. Structure generation and enumeration are performed based on Brendan McKay's canonical augmentation method. RetroPath 2.0 provides a variety of workflows such as isomer transformation, enumeration, QSAR and metabolomics.

Besides these mathematical structure generation methods, implementations of neural networks, such as generative autoencoder models,[39][40] are novel directions in the field.

Structure reduction

Unlike these assembly methods, reduction methods make all the bonds between atom pairs, generating a hypergraph. Then, the size of the graph is reduced with respect to the constraints. First, the existence of substructures in the hypergraph is checked. Unlike assembly methods, the generation tree starts with the hypergraph, and the structures decrease in size at each step. Bonds are deleted based on the substructures. If a substructure is no longer in the hypergraph, the substructure is removed from the constraints. Overlaps in the substructures were also considered due to the hypergraphs. The earliest reduction-based structure generator is COCOA,[41] an exhaustive and recursive bond-removal method. Generated fragments are described as atom-centred fragments to optimize storage, comparable to circular fingerprints[42] and atom signatures.[43] Rather than storing structures, only the list of first neighbours of each atom is stored. The main disadvantage of reduction methods is the massive size of the hypergraphs. Indeed, for molecules with unknown structures, the size of the hyper structure becomes extremely large, resulting in a proportional increase in the run time.
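The bond-removal idea can be sketched on a toy scale. The code below assumes single bonds only, starts from a hyper structure in which every pair of heavy atoms is bonded, and deletes bonds one at a time until a target bond count is reached, discarding any branch that becomes disconnected (deleting further bonds can never reconnect it). The per-atom degree limits and the function names are illustrative assumptions, and the sketch is not the COCOA algorithm; in particular it returns labelled structures, so isomorphic duplicates would still have to be removed afterwards.

```python
from itertools import combinations


def degrees_ok(bonds, max_degree):
    """Check that no atom has more bonds than its allowed maximum."""
    degree = [0] * len(max_degree)
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    return all(d <= m for d, m in zip(degree, max_degree))


def is_connected(bonds, n):
    seen, stack = {0}, [0]
    while stack:
        a = stack.pop()
        for i, j in bonds:
            if a in (i, j):
                b = j if a == i else i
                if b not in seen:
                    seen.add(b)
                    stack.append(b)
    return len(seen) == n


def reduce_bonds(max_degree, n_bonds):
    """Delete bonds from the fully bonded hyper structure down to n_bonds bonds."""
    n = len(max_degree)
    start = frozenset(combinations(range(n), 2))
    visited, results, stack = set(), set(), [start]
    while stack:
        bonds = stack.pop()
        if bonds in visited:
            continue
        visited.add(bonds)
        if not is_connected(bonds, n):
            continue  # prune: removing more bonds cannot restore connectivity
        if len(bonds) == n_bonds:
            if degrees_ok(bonds, max_degree):
                results.add(bonds)
            continue
        for bond in bonds:
            stack.append(bonds - {bond})
    return results


# Three carbons (at most 4 bonds each) and one oxygen (at most 2 bonds), 3 bonds.
print(len(reduce_bonds([4, 4, 4, 2], 3)))  # 15 labelled acyclic skeletons
```

Even in this toy case the search begins from the largest possible bond set, which hints at why the size of the hyper structure dominates the run time of reduction methods.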

The structure generator GEN[44] by Simona Bohanec combines two tasks: structure assembly and structure reduction. As in COCOA, the initial state of the problem is a hyper structure. Both assembly and reduction methods have advantages and disadvantages, and the GEN tool avoids these disadvantages in the generation step. In other words, structure reduction is efficient when structural constraints are provided, and structure assembly is faster without constraints. First, the useless connections are eliminated, and then the substructures are assembled to build structures. Thus, GEN copes with the constraints in a more efficient way by combining these methods. GEN removes the connections creating the forbidden structures, and then the connection matrices are filled based on substructure information. The method does not accept overlaps among substructures. Once the structure is built in the matrix representation, the saturated molecule is stored in the output list. The COCOA method was further improved and a new generator, HOUDINI,[45] was built. It relies on two data structures: first, a square matrix of compounds representing all bonds in a hyper structure is constructed; second, a substructure representation is used to list atom-centred fragments. In the structure generation, HOUDINI maps all the atom-centred fragments onto the hyper structure.

Mathematical basis

Chemical graphs

Graph representation of the serotonin molecule. (A) Molecular structure of serotonin. (B) Graph representation of the molecule.

In a graph representing a chemical structure, the vertices and edges represent atoms and bonds, respectively. The bond order corresponds to the edge multiplicity, and as a result, chemical graphs are vertex- and edge-labelled graphs. A vertex- and edge-labelled graph $G = (V, E)$ is described as a chemical graph, where $V$ is the set of vertices, i.e., atoms, and $E$ is the set of edges, which represent the bonds.

In graph theory, the degree of a vertex is its number of connections. In a chemical graph, the maximum degree of an atom is its valence, i.e., the maximum number of bonds the chemical element can make. For example, carbon's valence is 4. In a chemical graph, an atom is saturated if it reaches its valence. A graph is connected if there is at least one path between each pair of vertices. Although chemical mixtures[46] are one of the main interests of many chemists, due to the computational explosion, many structure generators output only connected chemical graphs. Thus, the connectivity check is one of the mandatory intermediate steps in structure generation because the aim is to generate fully saturated molecules. A molecule is saturated if all its atoms are saturated.
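The definitions above translate directly into a small data structure. The following is a minimal sketch, assuming a hard-coded valence table and a hypothetical class name `ChemicalGraph`; atoms are vertices labelled with an element symbol, bonds are edges labelled with a bond order, and an atom counts as saturated when the sum of its bond orders equals its valence.

```python
VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}  # assumed standard valences


class ChemicalGraph:
    """Vertex- and edge-labelled graph: elements on atoms, bond orders on edges."""

    def __init__(self, elements, bonds):
        self.elements = list(elements)
        # bonds maps an atom-index pair (i, j) to its bond order
        self.bonds = {frozenset(pair): order for pair, order in bonds.items()}

    def bond_order_sum(self, atom):
        return sum(order for pair, order in self.bonds.items() if atom in pair)

    def is_saturated(self):
        return all(self.bond_order_sum(a) == VALENCE[e]
                   for a, e in enumerate(self.elements))

    def is_connected(self):
        if not self.elements:
            return True
        seen, stack = {0}, [0]
        while stack:
            a = stack.pop()
            for pair in self.bonds:
                if a in pair:
                    (b,) = pair - {a}
                    if b not in seen:
                        seen.add(b)
                        stack.append(b)
        return len(seen) == len(self.elements)


# Formaldehyde, CH2O: atom 0 is C, atom 1 is O, atoms 2 and 3 are H.
formaldehyde = ChemicalGraph(["C", "O", "H", "H"],
                             {(0, 1): 2, (0, 2): 1, (0, 3): 1})
print(formaldehyde.is_saturated(), formaldehyde.is_connected())  # True True
```

Dropping the bond between atoms 0 and 3 leaves both the carbon and one hydrogen unsaturated and the graph disconnected, so both checks would then fail.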

Symmetry groups for molecular graphs

For a set of elements, a permutation is a rearrangement of these elements.[47] An example is given below:

1 2 3 4 5 6 7 8 9 10 11
4 2 11 6 1 5 8 9 7 10 3

The second line of this table shows a permutation of the first line. The multiplication of permutations, $a$ and $b$, is defined as a function composition, as shown below.

$$(a \cdot b)(x) = a(b(x))$$

The combination of two permutations is also a permutation. A group, $G$, is a set of elements together with an associative binary operation defined on $G$ such that the following are true:

  • There is an element $I$ in $G$ satisfying $g \cdot I = g$, for all elements $g$ of $G$.
  • For each element $g$ of $G$, there is an element $g^{-1}$ such that $g \cdot g^{-1}$ is equal to the identity element.

The order of a group is the number of elements in the group. Let us assume $X$ is a set of integers. Under the function composition operation, $\mathrm{Sym}(X)$, the set of all permutations over $X$, is a group, called the symmetric group. If the size of $X$ is $n$, then the order of $\mathrm{Sym}(X)$ is $n!$. Set systems consist of a finite set and its subsets, called blocks of the set. The set of permutations preserving the set system is used to build the automorphisms of the graph. An automorphism permutes the vertices of a graph; in other words, it maps a graph onto itself. This action is edge-vertex preserving. If $\{u, v\}$ is an edge of the graph $G = (V, E)$, and $\pi$ is a permutation of $V$, then

$$\pi(\{u, v\}) = \{\pi(u), \pi(v)\}$$

A permutation $\pi$ of $V$ is an automorphism of the graph $G$ if

$\{\pi(u), \pi(v)\}$ is an element of $E$, if $\{u, v\}$ is an element of $E$.

The automorphism group of a graph $G$, denoted $\mathrm{Aut}(G)$, is the set of all automorphisms on $V$. In molecular graphs, canonical labelling and molecular symmetry detection are implementations of automorphism groups. Although there are well-known canonical labelling methods in the field, such as InChI[48] and ALATIS,[49] NAUTY is a commonly used software package for automorphism group calculations and canonical labelling.
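For very small graphs, permutation composition and the automorphism group can be computed by brute force, which makes the definitions concrete. The sketch below stores a permutation of the vertices 0 to n-1 as a tuple whose entry at position x is the image of x, composes permutations as function composition, and finds automorphisms by testing every permutation against the edge set. The function names are hypothetical, and the exhaustive search is only an illustration; packages such as NAUTY use far more efficient algorithms.

```python
from itertools import permutations


def compose(a, b):
    """Product of two permutations as function composition: x -> a(b(x))."""
    return tuple(a[b[x]] for x in range(len(b)))


def automorphisms(n, edges):
    """All vertex permutations that map the edge set onto itself."""
    edge_set = {frozenset(e) for e in edges}
    return [perm for perm in permutations(range(n))
            if {frozenset((perm[u], perm[v])) for u, v in edges} == edge_set]


# A four-membered ring: its automorphisms are the 4 rotations and 4 reflections.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
group = automorphisms(4, square)
print(len(group))  # 8
```

The returned list is closed under `compose` and contains the identity permutation, as required of a group; its length is the order of the automorphism group.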

List of available structure generators

The available software packages and their links are listed below.

Name Link
ASSEMBLE www.upstream.ch/main.html
COCON cocon.nmr.de
DENDRAL CONGEN+GENOA www.softwarepreservation.org/projects/AI/DENDRAL/DENDRAL-CONGEN_GENOA.zip/view
LSD eos.univ-reims.fr/LSD/index_ENG.html
MAYGEN maygenerator.github.io[50]
MOLGEN www.molgen.de
MOLSIG molsig.sourceforge.net
OMG sourceforge.net/p/openmg
PMG sourceforge.net/projects/pmgcoordination
SENECA github.com/steinbeck/seneca
SMOG ccl.net/cca/software/MS-DOS/SMOG
Surge structuregenerator.github.io[51]

References

This article was adapted from the following source under a CC BY 4.0 license (2021) (reviewer reports): Mehmet Aziz Yirik; Christoph Steinbeck (5 January 2021). "Chemical graph generators". PLOS Computational Biology. 17 (1): e1008504. doi:10.1371/JOURNAL.PCBI.1008504. ISSN 1553-734X. PMC 7785115. PMID 33400699. Wikidata Q104747658.

  1. ^ Yirik, Mehmet Aziz; Steinbeck, Christoph (5 January 2021). "Chemical graph generators". PLOS Computational Biology. 17 (1): e1008504. Bibcode:2021PLSCB..17E8504Y. doi:10.1371/journal.pcbi.1008504. PMC 7785115. PMID 33400699.
  2. ^ Bruccoleri RE; Karplus M (1 January 1987). "Prediction of the folding of short polypeptide segments by uniform conformational sampling". Biopolymers. 26 (1): 137–168. doi:10.1002/BIP.360260114. ISSN 0006-3525. PMID 3801593. Wikidata Q69715633.
  3. ^ Sutherland, G. (1967-02-15). "DENDRAL-a computer program for generating and filtering chemical structures". DEPT OF COMPUTER SCIENCE. Stanford University.
  4. ^ Robert K. Lindsay; Bruce G. Buchanan; Edward A. Feigenbaum; Joshua Lederberg (June 1993). "DENDRAL: A case study of the first expert system for scientific hypothesis formation". Artificial Intelligence. 61 (2): 209–261. doi:10.1016/0004-3702(93)90068-M. ISSN 0004-3702. Wikidata Q29387651.
  5. ^ Karina A. Gulyaeva; Irina L. Artemieva (2020). "The Ontological Approach in Organic Chemistry Intelligent System Development". Advances in Intelligent Systems and Computing: 69–78. doi:10.1007/978-981-32-9343-4_7. ISSN 2194-5357. Wikidata Q105092432.
  6. ^ V.V. Serov; M.E. Elyashberg; L.A. Gribov (April 1976). "Mathematical synthesis and analysis of molecular structures". Journal of Molecular Structure. 31 (2): 381–397. doi:10.1016/0022-2860(76)80018-X. ISSN 0022-2860. Wikidata Q99232065.
  7. ^ Faradzev, IA (1978). "Constructive enumeration of combinatorial objects". Colloq. Internat. CNRS. 260: 131–135.
  8. ^ Charles J. Colbourn; Ronald C. Read (1979). "Orderly algorithms for generating restricted classes of graphs". Journal of Graph Theory. 3 (2): 187–195. doi:10.1002/JGT.3190030210. ISSN 0364-9024. Zbl 0404.05051. Wikidata Q99232279.
  9. ^ Grüner, T; Laue, R; Meringer, M; Bayreuth, U (1997). "Algorithms for group actions: Homomorphism principle and orderly generation applied to graphs". DIMACS Series in Discrete Mathematics and Theoretical Computer Science. pp. 113–22.
  10. ^ Hidetsugu Abe; Peter C. Jurs (September 1975). "Automated chemical structure analysis of organic molecules with a molecular structure generator and pattern recognition techniques". Analytical Chemistry. 47 (11): 1829–1835. doi:10.1021/AC60361A007. ISSN 0003-2700. Wikidata Q99232471.
  11. ^ Shin-ichi Sasaki; Hidetsugu Abe; Yuji Hirota; Yoshiaki Ishida; Yoshihiro Kudo; Shukichi Ochiai; Keiji Saito; Tohru Yamasaki (1 November 1978). "CHEMICS-F: A Computer Program System for Structure Elucidation of Organic Compounds". Journal of Chemical Information and Computer Sciences. 18 (4): 211–222. doi:10.1021/CI60016A007. ISSN 1520-5142. Wikidata Q99233202.
  12. ^ Craig A. Shelley; Morton E. Munk (November 1981). "Case, a computer model of the structure elucidation process". Analytica Chimica Acta. 133 (4): 507–516. doi:10.1016/S0003-2670(01)95416-9. ISSN 0003-2670. Wikidata Q99233261.
  13. ^ Martin Badertscher; Andrew Korytko; Klaus-Peter Schulz; et al. (May 2000). "Assemble 2.0: a structure generator". Chemometrics and Intelligent Laboratory Systems. 51 (1): 73–79. doi:10.1016/S0169-7439(00)00056-3. ISSN 0169-7439. Wikidata Q99233839.
  14. ^ Raymond E. Carhart; Dennis H. Smith; Neil A. B. Gray; James G. Nourse; Carl Djerassi (April 1981). "Applications of artificial intelligence for chemical inference. 37. GENOA: a computer program for structure elucidation utilizing overlapping and alternative substructures". The Journal of Organic Chemistry. 46 (8): 1708–1718. doi:10.1021/JO00321A037. ISSN 0022-3263. Wikidata Q99233344.
  15. ^ H.J. Luinge; J.H. Van Der Maas (June 1990). "AEGIS, an algorithm for the exhaustive generation of irredundant structures". Chemometrics and Intelligent Laboratory Systems. 8 (2): 157–165. doi:10.1016/0169-7439(90)80131-O. ISSN 0169-7439. Wikidata Q99233812.
  16. ^ Christoph Steinbeck (20 September 1996). "LUCY—A Program for Structure Elucidation from NMR Correlation Experiments". Angewandte Chemie International Edition. 35 (17): 1984–1986. doi:10.1002/ANIE.199619841. ISSN 1433-7851. Wikidata Q50368945.
  17. ^ Christoph Steinbeck (November 2001). "SENECA: A Platform-Independent, Distributed, and Parallel System for Computer-Assisted Structure Elucidation in Organic Chemistry". Journal of Chemical Information and Computer Sciences. 41 (6): 1500–1507. doi:10.1021/CI000407N. ISSN 1520-5142. PMID 11749575. Wikidata Q28837910.
  18. ^ Jean-Loup Faulon (January 1996). "Stochastic Generator of Chemical Structure. 2. Using Simulated Annealing To Search the Space of Constitutional Isomers". Journal of Chemical Information and Computer Sciences. 36 (4): 731–740. doi:10.1021/CI950179A. ISSN 1520-5142. Wikidata Q28837961.
  19. ^ Jean-Marc Nuzillard; Massiot Georges (January 1991). "Logic for structure determination". Tetrahedron. 47 (22): 3655–3664. doi:10.1016/S0040-4020(01)80878-4. ISSN 0040-4020. Wikidata Q57818172.
  20. ^ K. A. Blinov; M. E. Elyashberg; S. G. Molodtsov; A. J. Williams; E. R. Martirosian (1 April 2001). "An expert system for automated structure elucidation utilizing 1H-1H, 13C-1H and 15N-1H 2D NMR correlations". Fresenius' Journal of Analytical Chemistry. 369 (7–8): 709–714. doi:10.1007/S002160100757. ISSN 0937-0633. PMID 11371077. Wikidata Q43616194.
  21. ^ Jochen Junker (28 July 2011). "Theoretical NMR correlations based Structure Discussion". Journal of Cheminformatics. 3 (1): 27. doi:10.1186/1758-2946-3-27. ISSN 1758-2946. PMC 3162559. PMID 21797997. Wikidata Q38264559.
  22. ^ Chang-Yu Hu; Lu Xu (November 1994). "Principles for structure generation of organic isomers from molecular formula". Analytica Chimica Acta. 298 (1): 75–85. doi:10.1016/0003-2670(94)90044-2. ISSN 0003-2670. Wikidata Q99233968.
  23. ^ Junfeng Hao; Lu Xu; Changyu Hu (October 2000). "Expert system for elucidation of structures of organic compounds (ESESOC)". Science in China. Series B: Chemistry. 43 (5): 503–515. doi:10.1007/BF02969496. ISSN 1006-9291. Wikidata Q105032775.
  24. ^ Jean-Loup Faulon (1 September 1994). "Stochastic Generator of Chemical Structure. 1. Application to the Structure Elucidation of Large Molecules". Journal of Chemical Information and Computer Sciences. 34 (5): 1204–1218. doi:10.1021/CI00021A031. ISSN 1520-5142. Wikidata Q99233862.
  25. ^ Jean-Loup Faulon; Carla J Churchwell; Donald P Visco (1 May 2003). "The signature molecular descriptor. 2. Enumerating molecules from their extended valence sequences". Journal of Chemical Information and Computer Sciences. 43 (3): 721–734. doi:10.1021/CI020346O. ISSN 1520-5142. PMID 12767130. Wikidata Q52016182.
  26. ^ Julio E Peironcely; Miguel Rojas-Chertó; Davide Fichera; Theo Reijmers; Leon Coulier; Jean-Loup Faulon; Thomas Hankemeier (17 September 2012). "OMG: Open Molecule Generator". Journal of Cheminformatics. 4 (1): 21. doi:10.1186/1758-2946-4-21. ISSN 1758-2946. PMC 3558358. PMID 22985496. Wikidata Q27499209.
  27. ^ Jean Loup Faulon (1 July 1992). "On using graph-equivalent classes for the structure elucidation of large molecules". Journal of Chemical Information and Computer Sciences. 32 (4): 338–348. doi:10.1021/CI00008A013. ISSN 1520-5142. Wikidata Q99233853.
  28. ^ Brendan D. McKay; Adolfo Piperno (January 2014). "Practical graph isomorphism, II". Journal of Symbolic Computation. 60: 94–112. doi:10.1016/J.JSC.2013.09.003. ISSN 0747-7171. Zbl 1394.05079. Wikidata Q99301767.
  29. ^ Yirik, M.A. (2020). "The Benchmark for Structure Generators" – via Blogger.
  30. ^ Mohammad Mahdi Jaghoori; Sung-Shik T.Q. Jongmans; Frank de Boer; Julio Peironcely; Jean-Loup Faulon; Theo Reijmers; Thomas Hankemeier (December 2013). "PMG: Multi-core Metabolite Identification". Electronic Notes in Theoretical Computer Science. 299: 53–60. doi:10.1016/J.ENTCS.2013.11.005. ISSN 1571-0661. Wikidata Q105032974.
  31. ^ M. S. Molchanova; V. V. Shcherbukhin; N. S. Zefirov (January 1996). "Computer Generation of Molecular Structures by the SMOG Program". Journal of Chemical Information and Computer Sciences. 36 (4): 888–899. doi:10.1021/CI950393Z. ISSN 1520-5142. Wikidata Q99233768.
  32. ^ I. P. Bangov; K. D. Kanev (February 1988). "Computer-assisted structure generation from a gross formula: II. Multiple bond unsaturated and cyclic compounds. Employment of fragments". Journal of Mathematical Chemistry. 2 (1): 31–48. doi:10.1007/BF01166467. ISSN 0259-9791. Wikidata Q105033085.
  33. ^ Kerber, A; Laue, R; Meringer, M; Varmuza, K. (2001). "MOLGEN-MS: Evaluation of low resolution electron impact mass spectra with MS classification and exhaustive structure generation". Advances in Mass Spectrometry. pp. 939–940.
  34. ^ Tomoyuki Miyao; Hiromasa Kaneko; Kimito Funatsu (14 June 2016). "Ring system-based chemical graph generation for de novo molecular design". Journal of Computer-Aided Molecular Design. 30 (5): 425–446. doi:10.1007/S10822-016-9916-1. ISSN 0920-654X. PMID 27299746. Wikidata Q50627884.
  35. ^ Tomoyuki Miyao; Hiromasa Kaneko; Kimito Funatsu (26 November 2014). "Ring-System-Based Exhaustive Structure Generation for Inverse-QSPR/QSAR". Molecular Informatics. 33 (11–12): 764–778. doi:10.1002/MINF.201400072. ISSN 1868-1743. PMID 27485423. Wikidata Q39092888.
  36. ^ Tomoyuki Miyao; Masamoto Arakawa; Kimito Funatsu (1 January 2010). "Exhaustive Structure Generation for Inverse-QSPR/QSAR". Molecular Informatics. 29 (1–2): 111–125. doi:10.1002/MINF.200900038. ISSN 1868-1743. PMID 27463853. Wikidata Q51758769.
  37. ^ Baudoin Delépine; Thomas Duigou; Pablo Carbonell; Jean-Loup Faulon (9 December 2017). "RetroPath2.0: A retrosynthesis workflow for metabolic engineers". Metabolic Engineering. 45: 158–170. doi:10.1016/J.YMBEN.2017.12.002. ISSN 1096-7176. PMID 29233745. Wikidata Q47256449.
  38. ^ Mathilde Koch; Thomas Duigou; Pablo Carbonell; Jean-Loup Faulon (19 December 2017). "Molecular structures enumeration and virtual screening in the chemical space with RetroPath2.0". Journal of Cheminformatics. 9 (1): 64. doi:10.1186/S13321-017-0252-9. ISSN 1758-2946. PMC 5736515. PMID 29260340. Wikidata Q47199780.
  39. ^ Artur Kadurin; Sergey Nikolenko; Kuzma Khrabrov; Alex Aliper; Alex Zhavoronkov (13 July 2017). "druGAN: An Advanced Generative Adversarial Autoencoder Model for de Novo Generation of New Molecules with Desired Molecular Properties in Silico". Molecular Pharmaceutics. 14 (9): 3098–3104. doi:10.1021/ACS.MOLPHARMACEUT.7B00346. ISSN 1543-8384. PMID 28703000. Wikidata Q38681438.
  40. ^ Thomas Blaschke; Marcus Olivecrona; Ola Engkvist; Jürgen Bajorath; Hongming Chen (13 December 2017). "Application of Generative Autoencoder in de Novo Molecular Design". Molecular Informatics. 37 (1–2): 1700123. arXiv:1711.07839. doi:10.1002/MINF.201700123. ISSN 1868-1743. PMC 5836887. PMID 29235269. Wikidata Q48127458.
  41. ^ Christie BD; Munk ME (1 May 1988). "Structure generation by reduction: a new strategy for computer-assisted structure elucidation". Journal of Chemical Information and Computer Sciences. 28 (2): 87–93. doi:10.1021/CI00058A009. ISSN 1520-5142. PMID 3392122. Wikidata Q38594392.
  42. ^ Robert C Glem; Andreas Bender; Catrin H Arnby; Lars Carlsson; Scott Boyer; James Smith (1 March 2006). "Circular fingerprints: flexible molecular descriptors with applications from physical chemistry to ADME". IDrugs: the Investigational Drugs Journal. 9 (3): 199–204. ISSN 1369-7056. PMID 16523386. Wikidata Q51947334.
  43. ^ Jean-Loup Faulon; Michael J Collins; Robert D Carr (1 March 2004). "The signature molecular descriptor. 4. Canonizing molecules using extended valence sequences". Journal of Chemical Information and Computer Sciences. 44 (2): 427–436. doi:10.1021/CI0341823. ISSN 1520-5142. PMID 15032522. Wikidata Q45023689.
  44. ^ Simona Bohanec (1 May 1995). "Structure Generation by the Combination of Structure Reduction and Structure Assembly". Journal of Chemical Information and Computer Sciences. 35 (3): 494–503. doi:10.1021/CI00025A017. ISSN 1520-5142. Wikidata Q99233866.
  45. ^ A. Korytko; K-P Schulz; M. S. Madison; Munk ME (1 September 2003). "HOUDINI: A New Approach to Computer-Based Structure Generation". Journal of Chemical Information and Computer Sciences. 43 (5): 1434–1446. doi:10.1021/CI034057R. ISSN 1520-5142. PMID 14502476. Wikidata Q52009004.
  46. ^ G. Massiot; J. M. Nuzillard (July 1992). "Computer-assisted elucidation of structures of natural products". Phytochemical Analysis. 3 (4): 153–159. doi:10.1002/PCA.2800030403. ISSN 0958-0344. Wikidata Q57818162.
  47. ^ Donald Lawson Kreher; Douglas R. Stinson (March 1999). "Combinatorial algorithms: generation, enumeration, and search". ACM SIGACT News. 30 (1): 33–35. doi:10.1145/309739.309744. ISSN 0163-5700. Wikidata Q105033277.
  48. ^ Stephen R Heller; Alan McNaught; Igor Pletnev; Stephen Stein; Dmitrii Tchekhovskoi (2015). "InChI, the IUPAC International Chemical Identifier". Journal of Cheminformatics. 7 (1): 23. doi:10.1186/S13321-015-0068-4. ISSN 1758-2946. PMC 4486400. PMID 26136848. Wikidata Q21146620.
  49. ^ Hesam Dashti; William M Westler; John L Markley; Hamid R Eghbalnia (23 May 2017). "Unique identifiers for small molecules enable rigorous labeling of their atoms". Scientific Data. 4: 170073. doi:10.1038/SDATA.2017.73. ISSN 2052-4463. PMC 5441290. PMID 28534867. Wikidata Q33718167.
  50. ^ Mehmet Aziz Yirik; Maria Sorokina; Christoph Steinbeck (December 2021). "MAYGEN: an open-source chemical structure generator for constitutional isomers based on the orderly generation principle". Journal of Cheminformatics. 13 (1): 48. doi:10.1186/s13321-021-00529-9. PMC 8254276. PMID 34217353.
  51. ^ Brendan D. McKay; Mehmet Aziz Yirik; Christoph Steinbeck (December 2022). "Surge: a fast open-source chemical graph generator". Journal of Cheminformatics. 14 (1): 24. doi:10.1186/s13321-022-00604-9. PMC 9034616. PMID 35461261.
