About PubChemLite
PubChemLite is a dynamic collection of chemicals compiled from major categories of the PubChem Compound Table of Contents (TOC) Browser. It condenses the PubChem database (>100M compounds from >1000 sources) into a compact selection, based on annotation content from authoritative sources, for efficient and interpretable non-target screening studies.
PubChemLite includes compounds relevant to environmental (yellow), metabolomics (purple) and exposomics (dark orange) studies. Any compound with annotation content under these categories is included. Multiple forms (salts, stereoisomers) are collapsed to their corresponding neutral component and captured as “related CIDs” (CID = PubChem Compound IDentifier). Annotation, patent and literature counts are summed over all related CIDs. Structure (SMILES, InChI, InChIKey), mass and XlogP values belong to the neutral (parent) CID. Collision cross section (CCS) values are calculated on adducts of the neutral form. Annotation content can be browsed for individual CIDs on PubChem (e.g., parathion); related CIDs can be searched via batch query (e.g., diclofenac).
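The collapsing of related forms described above can be illustrated with a minimal Python sketch. It is not the PubChemLite build code: the record layout, the function name `collapse_to_parent` and the CIDs and counts are all hypothetical placeholders, chosen only to show how counts from salts and stereoisomers could be summed onto a single neutral parent entry.

```python
from collections import defaultdict

# Hypothetical records: (cid, parent_cid, patent_count, literature_count).
# The CIDs and counts are made-up placeholders, not real PubChem data.
records = [
    (1001, 1001, 10, 5),  # neutral (parent) form
    (1002, 1001, 3, 1),   # salt of the same parent
    (1003, 1001, 0, 2),   # stereoisomer of the same parent
    (2001, 2001, 7, 0),   # unrelated compound with no other forms
]


def collapse_to_parent(records):
    """Group records by parent CID and sum their counts (illustrative only)."""
    totals = defaultdict(
        lambda: {"related_cids": [], "patents": 0, "literature": 0}
    )
    for cid, parent_cid, patents, literature in records:
        entry = totals[parent_cid]
        entry["related_cids"].append(cid)   # keep track of every collapsed form
        entry["patents"] += patents         # counts are summed over related CIDs
        entry["literature"] += literature
    return dict(totals)


collapsed = collapse_to_parent(records)
for parent_cid, entry in collapsed.items():
    print(parent_cid, entry)
```

In this sketch each parent CID ends up with one entry that lists its related CIDs and carries the summed counts, while structure-derived values such as mass would still be taken from the parent alone.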
PubChemLite is compiled weekly and archived monthly on Zenodo (DOI: 10.5281/zenodo.5995885, with CCS at DOI: 10.5281/zenodo.4081056). The code for the build system, the inputs and this interface is available on GitLab. Further details are given in Schymanski et al. (2021), DOI: 10.1186/s13321-021-00489-0, and Elapavalore et al. (2025), DOI: 10.1021/acs.estlett.4c01003.
You can also use PubChemLite directly in MetFrag, via the "Local Databases" option.
Current version: PubChemLite-CCSbase-20250303 (Published March 03, 2025)
