
Sulstice2 OP t1_ix5hq64 wrote

Hello,

Website & Mobile Friendly: https://sulstice.github.io/Faith/global_chem/index.html

I sampled the most commonly recorded chemicals across different sub-communities to see which atoms are most common and which pairs of atoms occur together most often. "Different communities" here means different classes of chemicals (cannabis, things used in sex products, toxic agents used in war, food colour additives, materials, cosmetics, birth control, etc.).

https://github.com/Sulstice/global-chem/blob/development/global_chem/GlobalChem_Dictionary%20(1).pdf

In the chord diagram above, each node is an atom type that exists within the dataset and each link is a bond between two atom types. The thickness of a link corresponds to how often those two atom types occur together. The pink shows how much of two different hydrogen types exists, and the blue represents a hydrogen paired with a carbon. The rest of the plot is colored light grey.
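To make the link thickness concrete, here is a minimal sketch of how the pair counts behind a chord diagram could be computed. The `bonds` list and the `count_type_pairs` helper are made up for illustration (the atom type labels are only examples), and this is not necessarily how the plot above was built.

```python
from collections import Counter

# Hypothetical input: each bond is a pair of CHARMM-style atom type labels.
# These rows are invented for illustration, not taken from the dataset.
bonds = [
    ("CG331", "HGA3"),
    ("HGA3", "CG331"),
    ("CG2R61", "HGR61"),
    ("CG2R61", "CG2R61"),
    ("CG331", "HGA3"),
]


def count_type_pairs(bonds):
    """Count unordered atom-type pairs; each count would set one link's thickness."""
    counts = Counter()
    for a, b in bonds:
        # Sort the pair so (A, B) and (B, A) land on the same link.
        counts[tuple(sorted((a, b)))] += 1
    return counts


pair_counts = count_type_pairs(bonds)
for (a, b), n in pair_counts.most_common():
    print(f"{a} - {b}: {n}")
```

Sorting each pair before counting means the direction a bond was recorded in does not split one link into two.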

Next, I passed them through the CHARMM force field, which has a language for declaring different types of atoms, such as an alkane versus an aromatic. In the plot I am highlighting HGA1 and HGR62; in our language these are aliphatic (alkane-like) hydrogens and six-membered-ring (benzene-like) hydrogens.

The data is available here; feel free to play around with it:

https://raw.githubusercontent.com/Sulstice/Faith/main/global_chem/atom_type_group_new.json
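If you want to poke at the file from Python, something like the sketch below might be a starting point. It uses only the standard library and deliberately assumes nothing about the file's layout: `describe` and `load_json` are hypothetical helper names, and all `describe` does is report what kind of JSON value it was given.

```python
import json
from urllib.request import urlopen

# URL copied from the post above.
URL = "https://raw.githubusercontent.com/Sulstice/Faith/main/global_chem/atom_type_group_new.json"


def describe(obj, max_items=5):
    """Summarize a parsed JSON value without assuming any particular schema."""
    if isinstance(obj, dict):
        keys = list(obj)[:max_items]
        return f"object with {len(obj)} keys, e.g. {keys}"
    if isinstance(obj, list):
        return f"array with {len(obj)} items"
    return f"{type(obj).__name__} value"


def load_json(url=URL):
    """Download and parse the JSON file (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)
```

Calling `print(describe(load_json()))` should show the top-level shape of the file, which is a reasonable first step before writing anything that depends on specific keys.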

Still a work in progress as I get it ready for PyData Global. I think there are some bugs. The code is here:

https://github.com/Sulstice/Faith/blob/main/global_chem/index.html
