# Expanding the alphabet of Life

Friends,

Imagine this: you go to a strange island and find that the inhabitants have their own language. You visit their library and find millions of books written in that language. Amazed by the size of the library, you pick up a book at random and open it. You see something very strange: there are only 4 symbols in that book: a, t, c and g. You are also astonished to see that the letter “a” always occurs next to “t”, and “c” only next to “g”. Then you pick up another book and you see the same thing. The whole library of several million books has been written in a language of only 4 letters, with strict restrictions on which letter can occur next to which. You then wonder: if this amazing civilization can create so much literature out of only 4 characters, what would happen if more letters were introduced into their alphabet? You rush out of the library to try to convince the chief of the island to include a few more letters. You are confident that the literature produced by the people of this island would be massively enriched if they expanded their alphabet.

Now you may wonder why I want you to imagine such an island, and you may say that it is silly to think of millions of books written using only 4 letters. But friends, the story may be fictional, yet its core theme is very closely related to a very important true story: the story of life itself. The DNA of every living organism in the world consists of 4 nucleobases: Adenine (A), Thymine (T), Cytosine (C) and Guanine (G). So one can say that the information in DNA is coded in terms of these 4 nucleobases. As we all know, a DNA molecule consists of 2 strands. You can think of these strands as sequences of nucleobases (for simplicity we will use the word “bases” for the rest of the article). Each base in one DNA strand is bound to a base in the other strand by chemical bonds. But not just any two bases can bond across the strands: Thymine can only bind to Adenine, and Cytosine can only bind to Guanine. So if one strand contains the sequence ATTCGG, the second strand must contain the sequence TAAGCC.
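The pairing rule can be written down as a tiny lookup table. The following is only a sketch of the simplified picture used in this article: the names `PAIRS` and `partner_strand` are made up for illustration, and the sketch pairs the strands position by position, ignoring the fact that real DNA strands run in opposite directions.

```python
# Pairing rules as described in the text: A binds T, C binds G.
# PAIRS and partner_strand are illustrative names, not from any library.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def partner_strand(strand: str) -> str:
    """Return the strand that pairs, base by base, with the given strand."""
    return "".join(PAIRS[base] for base in strand)


print(partner_strand("ATTCGG"))  # TAAGCC
```

Applying the function twice gives back the original strand, which is just another way of saying that each strand fully determines its partner.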

So we see that the incredible amount of information found in the DNA of each organism that lives or has ever lived on Earth is coded using only 4 letters: A, T, C and G. And the letters can only occur in fixed ways: A can only occur next to T, and C can only occur next to G. So the situation is very similar to the imaginary island and the incredibly interesting library that we discussed earlier. And now comes the real topic of this VERITAS: Can we enlarge the alphabet in the library of life and create even more complex and interesting books (organisms)? Can we expand the vocabulary of life and create new “words” that benefit us? Are the bases in DNA special, or can we make new, artificial ones? And most importantly, if we add new letters to the alphabet of life, will we be interfering with nature? There are two important properties of DNA bases and base pairs that artificial bases must have:

• The bonds in the A-T and C-G base pairs are such that the distance between the two strands of the DNA is always constant. If the A-T bond pulled the DNA strands closer together than the C-G bond does, the two strands could touch each other, and this would greatly complicate replication.
• DNA must be able to replicate. DNA replication is the basis of all biological inheritance. Briefly, this is how it works: the two strands of the DNA are unwound, and each strand then acts as the template for the creation of a new “partner” strand. So the two strands of one DNA molecule give rise to two identical DNA double helices, first by unwinding and then by synthesis of new strands. To give you a simplified example, let's take the DNA strand from the earlier example, which has the base sequence ATTCGG. The second strand would have TAAGCC. Before replication the two are connected like this:

A-T

T-A

T-A

C-G

G-C

G-C

During the replication process the two strands are first separated, so we get ATTCGG and TAAGCC as two separate, unconnected strands. Then enzymes synthesize a new TAAGCC to couple with the first and a new ATTCGG to couple with the second. So we get two DNA molecules which are identical to the original DNA.
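The replication steps just described can be mimicked in a few lines. This is a toy model under the same simplifications as the example above (strands as plain strings, paired position by position); `build_partner` and `replicate` are hypothetical names chosen for this sketch, and the real enzymatic process is far more involved.

```python
# Pairing rules from the text: A binds T, C binds G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def build_partner(template: str) -> str:
    """Build the new partner strand for one separated template strand."""
    return "".join(PAIRS[base] for base in template)


def replicate(dna: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Separate the two strands and give each old strand a new partner."""
    first, second = dna
    daughter_a = (first, build_partner(first))    # old first + new second
    daughter_b = (build_partner(second), second)  # new first + old second
    return daughter_a, daughter_b


original = ("ATTCGG", "TAAGCC")
copy_a, copy_b = replicate(original)
print(copy_a == original and copy_b == original)  # True
```

Each daughter keeps one old strand and gains one newly built strand, and both end up identical to the original pair.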

So if we create new base pairs they must be able to maintain the same distance between the DNA strands as A-T and C-G. Also enzymes should be able to properly replicate them. Now replication can be challenging because the enzymes need to synthesize new bases to match the original ones. Enzymes know how to synthesize A, T, C and G. But how will we “teach” them to make the new bases that we introduce?

Scientists have been trying to create artificial bases for over 20 years now. After many years of failure, scientists were able to make bases which could be placed alongside the natural bases in DNA. Many bases and base pairs were created, some more successful than others. The best-performing base pair consisted of two molecules called d5SICS and dNaM. Let's give them short names: K and L (in the original paper scientists called them X and Y, but I don't want you to confuse them with the X and Y chromosomes, so I am using K and L). Now, creating a molecule that bonds with another molecule in a lab is one thing. But we want to do something far more complex: we want to add to the alphabet of inheritance and thus of life itself. So the “real” experiment needs to be done in a real cell.

And a real experiment was conducted recently. Scientists added K and L pairs to the DNA of an E. coli bacterium. Now, getting artificial bases into a living cell is a challenge in itself. For this, scientists relied on a “transporter” which they obtained from some algae. The “transporter” is responsible for bringing the artificial bases into the cell. Once the artificial bases were in place, the DNA of the E. coli had a mix of natural and artificial bases. So, for example, the E. coli could have ATTKLCGG in one strand and TAALKGCC in the other (K only binds to L, just like T only binds to A and C only to G). This was a major breakthrough! We had a living cell with some natural and some artificial bases in its DNA.
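In the toy picture of DNA as a string of letters, adding the new pair amounts to adding two entries to the pairing table. The sketch below assumes the article's shorthand K and L for the two artificial bases; `EXPANDED_PAIRS` and `partner_strand` are illustrative names, and the strands are again paired position by position.

```python
# Natural pairing rules plus the artificial pair from the text: K binds L.
NATURAL_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
EXPANDED_PAIRS = {**NATURAL_PAIRS, "K": "L", "L": "K"}


def partner_strand(strand: str, pairs: dict[str, str]) -> str:
    """Return the partner strand under the given pairing rules."""
    return "".join(pairs[base] for base in strand)


print(partner_strand("ATTKLCGG", EXPANDED_PAIRS))  # TAALKGCC
```

With only `NATURAL_PAIRS`, the same call fails with a `KeyError` on the letter K, which loosely mirrors the point that the natural machinery has no rule for a base it has never seen.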

But the biggest success was still to come. Scientists were astonished to find that when the DNA was replicated, not only were the natural bases faithfully recreated, but the artificial bases were also replicated by the enzymes! So once you get the artificial bases into the DNA, the enzymes can replicate the DNA properly, and the artificial base pairs are passed on to future generations as well. Scientists observed that the artificial bases were carried intact through generation after generation of the bacteria.

So this really is a very exciting breakthrough. We can expand the alphabet of life! And we can expand it in such a way that it benefits us. When we add information to DNA, we can add it in a way that helps us: we could create new vaccines, we could create new drugs, we could destroy certain diseases. The possibilities are endless. Of course, knowledge without wisdom can be a dangerous thing. So if we are to change the basic structures of life itself, we had better do it carefully, with the intention of improving the beauty and diversity of life and not solely for profit.

Regards

Kanwar

============= ============ =================== ============== ========= =====

Go, wondrous creature! mount where Science guides,

Go, measure earth, weigh air, and state the tides;

Instruct the planets in what orbs to run,

Correct old Time, and regulate the Sun

====== ======= =========== ============================== =============