Deep Codon Initiative - Quantum Codon Pvt Ltd
The Genome Holds a Secret 98%. We Are Decoding It.
"Dark genome is a treasure house for the next generation of drug discovery molecules." - Prof. Pawan K Dhar, CSO, Quantum Codon
For fifty years, biology focused on the 2% of DNA that codes for proteins. Deep Codon systematically unlocks the remaining 98% - non-expressing DNA and non-translating RNA - and converts it into first-in-class therapeutic molecules.

15+
Years of continuous dark genome research, reproducible across labs
6+
Disease areas with proof-of-concept evidence
98%
Genome space historically unmined for therapeutics
22 nM
IC50 of tREP-18 against Leishmania
Untapped biological space
The Untapped Scale of the Dark Genome
Every genome - bacterial, yeast, fly, worm, human - contains an overwhelming majority of sequences that have never been expressed as proteins. These are not gaps or errors. They are the unexplored majority of life's coding potential.
Approximate proportions vary by organism and annotation methodology. Class I + Class II form Deep Codon's therapeutic reservoir.
The Deep Codon classification
Two classes of dark genomic matter. One drug discovery canvas.
Deep Codon classifies the unexplored genome into non-expressing DNA (Class I) and non-translating RNA (Class II), each with distinct and complementary therapeutic opportunities.

Non-Expressing DNA Sequences
DNA regions that are present in every cell but are never transcribed into RNA under natural conditions. Evolution chose not to express them - not because they are useless, but because they were never sampled. Deep Codon's synthetic expression platform unlocks this vast, untouched coding reservoir for the first time.
- Intergenic regions - sequences between any two annotated genes. Our 2009 proof-of-concept showed 6/6 randomly selected E. coli intergenic sequences produced stable, functional proteins when synthetically expressed.
- Antisense sequences - complementary strands to coding sequences. Full-length antisense proteins predicted across E. coli (0.7%), S. cerevisiae (0.15%), and D. melanogaster (0.2%) - many with enzymatic activity.
- Reverse ORFs - reading existing coding sequences in the reverse (-1) frame. A completely parallel protein universe derived from every annotated gene.
- Repetitive elements - telomeric repeats, microsatellites, LINE/SINE-derived sequences. An underexplored test bed for novel scaffold motifs and functional diversity.
- Pseudogenes - evolutionary relics: once-active genes now silenced by mutation. Thousands exist across genomes. Synthetic reconstruction shows many pseudogene-derived peptides fold into stable, functional proteins.
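The antisense and reverse categories above can be illustrated with a minimal Python sketch. It assumes "antisense" means the reverse complement of a coding sequence and "reverse" means the same strand read in the opposite direction; the source does not spell out either procedure. The toy sequence and the function names are hypothetical, and the standard genetic code is used throughout.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def antisense(dna):
    """Reverse complement: the opposite strand, read 5' to 3' (assumed definition)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def reverse(dna):
    """Same strand read backwards, without complementing (assumed definition)."""
    return dna[::-1]


cds = "ATGGCTAAAGGTTAA"  # hypothetical toy coding sequence, not from any genome
print("sense:    ", translate(cds))
print("antisense:", translate(antisense(cds)))
print("reverse:  ", translate(reverse(cds)))
```

One annotated sequence yields three unrelated peptides, which is the sense in which every gene carries a parallel sequence space.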

Non-Translating RNA Sequences
RNA molecules produced by the cell but never translated into protein throughout evolutionary history. These include the machinery of translation itself - tRNAs, rRNAs - as well as regulatory and structural RNAs that form the cell's hidden information layer. Deep Codon has demonstrated that synthetic translation of these sequences produces biologically active peptides with remarkable therapeutic properties.
- Introns - spliced out during mRNA processing, long considered splicing waste. Our studies show these sequences systematically translate into stable, bioactive peptides and proteins - a hidden layer of functional chemistry.
- tRNA-derived peptides (tREPs) - the most exciting discovery: tREP-18, derived from E. coli tRNA sequences, showed anti-leishmanial activity at IC50 = 22.13 nM while remaining safe for human cells. A completely new class of bioactive molecule.
- Ribosomal RNA (rRNA) - the scaffold of the ribosome itself, never translated throughout all of evolutionary time, offers a unique template for novel functional peptides. Nature's protein-making machinery can itself become a source of new proteins.
- MicroRNA (miRNA) - only ~22 nucleotides, but with remarkable precision. These smallest transcriptome elements may become the most precise tools in peptide engineering when synthetically translated.
- Long non-coding RNA (lncRNA) - hundreds to thousands of bases. Enormous, uncharted protein-coding reservoir. Sheer sequence diversity provides a fertile platform for designer peptides and novel biochemical pathways.
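As a rough illustration of what "synthetic translation" of a non-coding RNA could look like in silico, the sketch below reads an RNA in all three forward frames and keeps every stop-free stretch above a minimum length. This is an assumption for illustration only: the source does not describe how its Class II peptides were derived, and the input sequence, the function name, and the `min_len` cut-off are hypothetical.

```python
from itertools import product

# Standard genetic code, RNA alphabet, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def three_frame_peptides(rna, min_len=3):
    """Return (frame, peptide) pairs for each stop-free stretch of at least min_len residues."""
    hits = []
    for frame in range(3):
        codons = [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]
        protein = "".join(CODON_TABLE[c] for c in codons)
        # A '*' marks a stop codon; split on it to get uninterrupted segments.
        for segment in protein.split("*"):
            if len(segment) >= min_len:
                hits.append((frame, segment))
    return hits


toy_rna = "GGCAAAUAGCUUGGAUCCGU"  # hypothetical 20-nt sequence, not a real tRNA
for frame, peptide in three_frame_peptides(toy_rna):
    print(frame, peptide)
```

No start codon is required here, since these transcripts were never selected for translation; whether a real pipeline would enforce one is an open design choice.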
Class I + Class II -> First-in-Class Pathways
By combining Class I proteins and Class II peptides, using domain prediction and molecular docking, Deep Codon can design and construct entirely novel cellular pathways: regulatory, signalling, or metabolic. These pathways do not exist in nature. They emerge from the dark matter of the genome, expressed and orchestrated for the first time.
15+ years of proof - 2009 to 2026
Validated results across six disease areas
This is not a hypothesis awaiting validation. The platform has produced biologically active molecules against cancer, malaria, leishmaniasis, Alzheimer's disease, pathogenic microbes, and viral pathogens over a sustained, peer-reviewed research programme beginning in 2009.

2009 - Dhar et al - JNU, New Delhi
Proof of Concept - Class I
World's First Dark Genome Expression
Six E. coli intergenic sequences, none previously expressed, were cloned and synthetically expressed. All six produced stable proteins, with Eka1 causing potent reversible growth inhibition.

2013-2015 - Joshi, Krishnan et al
Anti-Malaria - Class I
Plasmodium falciparum Invasion Blocked
Synthetic peptides from S. cerevisiae intergenic sequences were screened against P. falciparum invasion proteins. Docking and cell experiments showed more than 60% inhibition of parasite entry.

2015-2023 - Raj, Verma et al
Alzheimer's - Class I
BACE1 Inhibition: 86.7% at 1 µM
From 2,500 intergenic sequences and 424 novel peptides, ECOI2 achieved 86.7% BACE1 inhibition and reduced amyloid-beta (Aβ) 1-40 and 1-42 in SH-SY5Y neuroblastoma cells.

2023 - Dhar et al - Published
Anti-Leishmania - Class II
First Functional tRNA-Derived Peptide
E. coli tRNAs were computationally translated into tREPs. tREP-18 showed IC50 = 22.13 nM against L. donovani and remained safe for human macrophages.

2024 - Shanthappa et al
Vaccines - Class II
tREP-Derived Antiviral Vaccine Epitopes
tRNA-encoded peptides were screened as vaccine epitopes against viral pathogens. RRHIDIVV and IMVRFSAE showed favorable HLA binding and 200 ns molecular dynamics stability.

2016-2023 - Varughese, Garg et al
Enzymes - Class I
Antisense and Reverse Protein Landscape
Full-length antisense and reverse proteins were mapped across E. coli, S. cerevisiae, and D. melanogaster, with many candidates predicted to have enzymatic, transporter, or secretory functions.
2009-now
Continuous research program across disease areas and organisms
6+
Disease areas with experimental evidence
86.7%
BACE1 inhibition at 1 µM from ECOI2
22 nM
IC50 of tREP-18 against L. donovani
The Deep Codon technology platform
From dark genome to validated drug candidate
An integrated pipeline for converting naturally silent genomic sequences into validated therapeutic candidates through bioinformatics, AI prediction, molecular simulation, and experimental validation.
Dark Genome Mapping
Identify Class I and Class II sequences across model organisms and cross-reference against NCBI GEO and NR databases.
AI Prediction
Translate in silico, predict tertiary structure, screen toxicity, and rank stability, solubility, charge, and immunogenicity.
Virtual Screening
Dock dark-genome candidates against kinases, GPCRs, enzymes, viral proteins, and other target classes.
Quantum Simulation
Use molecular dynamics and quantum modules to improve binding, folding, electron distribution, and reaction-energy modeling.
Experimental Validation
Synthesize or express top candidates, then validate through cell assays, Western blot, flow cytometry, and preclinical models.
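To make the ranking idea in the AI Prediction step concrete, here is a minimal Python sketch that scores peptides on two simple physicochemical properties: an approximate net charge and the mean Kyte-Doolittle hydropathy (GRAVY). It is a stand-in under stated assumptions, not the platform's actual scoring. The two peptide strings are the epitopes named in the 2024 vaccine entry; the function names and the choice to sort by hydropathy are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(peptide):
    """Crude net charge near neutral pH: +1 per K/R, -1 per D/E (histidine ignored)."""
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return positive - negative


def gravy(peptide):
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


def rank_candidates(peptides):
    """Sort most hydrophilic first (lowest GRAVY), a rough proxy for solubility."""
    return sorted(peptides, key=gravy)


candidates = ["IMVRFSAE", "RRHIDIVV"]
for peptide in rank_candidates(candidates):
    print(peptide, net_charge(peptide), round(gravy(peptide), 3))
```

A production pipeline would replace these heuristics with learned predictors for stability, toxicity, and immunogenicity; the sketch only shows the shape of a score-then-sort step.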
Artificial Intelligence
Making the Invisible Visible at Scale
AI trained on structural and functional genomic data enables high-throughput prediction of which silent sequences can produce stable, non-toxic, biologically active molecules.
- AlphaFold-based tertiary structure prediction
- Multi-omics integration across genetic, immune, and metabolic data
- ADMET and toxicity screening at genome scale
- Automated candidate prioritization by druggability
Quantum Computing
Simulating Molecular Reality with Precision
Quantum computing modules provide a path toward higher-fidelity modeling of how molecules bind, fold, and react inside complex biological systems.
- Quantum-level electron distribution modeling
- Variational Quantum Eigensolver for electronic structure
- Quantum pattern recognition in high-dimensional data
- Molecular dynamics refined by quantum accuracy
Landmark publication - 2025
The Scientific Foundation
The Deep Codon platform is anchored in 15+ years of published research, culminating in a preprint proposing an integrated AI + quantum framework for dark genome drug discovery.
Preprint - Posted 19 May 2025 - Preprints.org - Biology and Biotechnology
Recoding Genomic Elements with AI and Quantum Computation to Build the Next Generation Drug Discovery Platform
Kadalmani Krishnan, Anita Chugh, Vidya Niranjan, Pawan Kumar Dhar*
DOI: 10.20944/preprints202505.1422.v1
"We propose a next-generation, first-in-class drug discovery platform that harnesses the vast, untapped genomic landscape through the integration of Artificial Intelligence and Quantum Computing."
Verma, Manvati & Dhar (2023). Harnessing Escherichia coli's Dark Genome to Produce Anti-Alzheimer Peptides. ECOI2: 86.7% BACE1 inhibition.
Garg & Dhar (2023a). Repurposing the Dark Genome I: Antisense Proteins. Novel antisense protein landscape.
Nayak & Dhar (2023). Repurposing the Dark Genome II - Reverse Proteins. Reverse ORF therapeutic potential.
Garg & Dhar (2023b). Repurposing The Dark Genome III - Intronic Proteins. Intron-derived peptide bioactivity.
Shanthappa et al (2024). tREP-Derived Antiviral Vaccine Epitopes. RRHIDIVV + IMVRFSAE: 200 ns MD stable.
Investor brief - Quantum Codon Pvt Ltd
The last great frontier in drug discovery is inside our own genomes.
By a widely cited estimate, bringing a single new drug to market costs about USD 2.6 billion, and 90%+ of candidates fail in clinical development. The structural reason: the industry is mining only 1-2% of available biological space.
Deep Codon has built a platform from Class I non-expressing DNA and Class II non-translating RNA sequences. Every intergenic region, antisense strand, tRNA, and pseudogene becomes part of the competitive moat.
15 Years of Proprietary Science
Reproducible, published research across 6 disease areas creates a long-duration scientific advantage.
Amaravati Quantum Valley Anchor
Quantum integration is already architected into the discovery pipeline, ready to scale as quantum infrastructure matures.
No Competitive Platform Exists
The non-expressing and non-translating genomic space remains largely unexplored territory.
Partner with Quantum Codon
The genome's most important medicines are yet to be discovered.
Deep Codon is the platform built to find them in the 98% of the genome that science has barely explored. Join us at the frontier of next-generation drug discovery.
