Chromosome selection
This is all speculation from a lay perspective. The idea is to make an embryo with a genome selected chromosome by chromosome from some input genomes.
1. The point
Chromosome selection might be significantly faster and cheaper than iterated recombinant selection (Gwern on IES, Metacelsus's iterated meiosis scheme). It avoids many hundreds or thousands of genome sequencings and many rounds of meiosis and combination (fertilization, electrofusion), instead (optimistically) using on the order of a few hundred operations (depending on the details) and only a few genome sequencings. Chromosome selection could also synergize quite strongly with iterated meiotic selection or iterated embryo selection, combining the power of recombination with the efficiency of assembling sets of chromosomes.
Chromosome selection is limited compared to iterated recombinant selection because there is no chromosomal recombination to mix and match segments from different chromosomes of the same type, but there is (on some assumptions about the genetics of traits) already a significant amount of variance among whole chromosomes. Estimates may be forthcoming; see the appendix for flawed estimates. [Update: see this post for estimates.]
2. Short version
1. Sequence some genomes.
2. Select some chromosomes from those genomes.
3. Extract those chromosomes from cells.
4. Create an embryo that incorporates those chromosomes.
How might we do steps 3 and 4? What are the hardest parts?
3. Update January 2023: existing research
Chromosome transplantation
Someone points out this paper "Chromosome Transplantation: A Possible Approach to Treat Human X-linked Disorders", Paulis et al. 2020 demonstrating X-chromosome transplantation in human iPSCs. Cool! How can this be extended?
Whole cell fusion
Another major idea: whole cell fusion combines two cells. Inducing a fused tetraploid cell to undergo mitosis produces, with reasonably high yield, diploid cells with genomes that are random subsets of the tetraploid cell. By iteratively fusing, dividing, and selecting cells, a sort of "poor man's chromosome selection" can be implemented.
Whole cell fusion seems to be a standard method. Ploidy reduction (i.e. going from tetraploid cells, 4 of each index of chromosome, to diploid cells, with 2 of each index of chromosome) has been shown to happen naturally in mouse cells: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6795515/. One guesses that it wouldn't be too hard to induce the same in a cultured cell line.
Abstractly, this is somewhat analogous to the effects on a single chromosome of iterated meiotic selection with hyperrecombination. The analogy is like this:
| meiotic selection, hyperrecombination | poor man's chromosome selection           |
|---------------------------------------|-------------------------------------------|
| cell fusion, haploid to diploid       | cell fusion, diploid to tetraploid        |
| meiosis, diploid to haploid           | mitosis, tetraploid to sometimes diploid  |
| one chromosome                        | whole diploid genome                      |
| 46 crossovers per round               | 46 of 92 chromosomes passed on each round |
| 1 of 2 1/46 segments passed on        | 2 of 4 chromosomes passed on              |
The analogy isn't perfect because the ploidy reduction passes on a random subset of 2 of 4 homologous chromosomes, whereas meiosis passes on 1 of 2 homologous segments. But I think the analogy is close enough that the analysis described here should very roughly still apply: https://tsvibt.blogspot.com/2022/10/technical-pathways-to-strong-genomic.html#iterated-meiotic-selection-ims
What's cool about this method is that it seems to already be feasible, and to be fairly simple to implement. A drawback is that it can only produce diploid cells.
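To make the fuse, divide, select loop concrete, here is a toy simulation. Everything in it is an assumption made for illustration, not a model of real cell biology: each chromosome gets a single additive score drawn from a Gaussian, every ploidy reduction yields a balanced diploid with a random 2 of the 4 homologs of each type, and each round fuses the current best cell with a fresh random partner. The function names (`random_cell`, `fuse_and_reduce`, `poor_mans_selection`) are made up for this sketch.

```python
import random

N_TYPES = 23  # chromosome types; sex chromosomes are treated like any other pair here


def random_cell(rng):
    """A diploid cell: 23 pairs of per-chromosome scores (assumed additive, Gaussian)."""
    sd = (1 / 46) ** 0.5  # so that a whole random genome has variance 1
    return [[rng.gauss(0, sd), rng.gauss(0, sd)] for _ in range(N_TYPES)]


def score(cell):
    """Total score of a cell: the sum over all of its chromosomes."""
    return sum(sum(pair) for pair in cell)


def fuse_and_reduce(a, b, rng):
    """Fuse two diploid cells, then keep a random 2 of the 4 homologs of each type."""
    return [rng.sample(pair_a + pair_b, 2) for pair_a, pair_b in zip(a, b)]


def poor_mans_selection(rounds, daughters, rng):
    """Repeatedly fuse the current best cell with a random partner and keep the best result."""
    best = random_cell(rng)
    history = [score(best)]
    for _ in range(rounds):
        partner = random_cell(rng)
        candidates = [fuse_and_reduce(best, partner, rng) for _ in range(daughters)]
        candidates.append(best)  # design choice: keep the parent so the score never drops
        best = max(candidates, key=score)
        history.append(score(best))
    return history


history = poor_mans_selection(rounds=20, daughters=50, rng=random.Random(0))
print([round(s, 2) for s in history])
```

Scores are in units of the whole genome's standard deviation. Screening `daughters` cells per round stands in for sequencing or otherwise genotyping them, which is where the real cost would be.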
4. Update September 2022: current thoughts on step 3
- Prepare the chromosomes.
- Wait until anaphase.
- Bind oligonucleotides attached to magnetic beads to sites on target chromosomes.
- Somehow chemically protect the chromosomes from mechanical stresses. E.g. maybe coat the chromosomes in some protein that encases them so that the DNA floats in some cytoplasm, or maybe bind the chromosomes up into a ball by randomly connecting different sites so that the chromosome is more ball-shaped and will experience less shear (less difference in acceleration at different points).
- Dissolve the cell membranes.
- Run the stuff over a magnetic array.
- Do this as gently as possible to avoid mechanical stress. E.g. use a viscous, slow-moving medium.
- Make the environment less damaging, e.g. cold to prevent thermal damage, shielded from UV, low in harmful ions.
- Use nuclear extract as the medium (maybe including extra DNA repair proteins?).
- Somehow isolate individual chromosomes in liposomes, remove the magnetic beads, and assemble the chromosomes into a cell.
I don't know about how DNA responds to mechanical stresses, and what mechanical stresses would be present in this protocol.
If mechanical stresses can't be gotten around, maybe the next best method would be to use CRISPR to cut up all the non-target chromosomes. IIUC, off-target events have a rate of about 1%. So cutting all non-target chromosomes at 10 places each would take about 440 target sites, which would succeed with 0 cuttings on the target chromosome with probability $\approx 1/e^{4.4} \approx 0.01$, which might be a sufficient yield. (This is a very rough estimate, as it assumes that every off-target event would hit the target chromosome, and ignores that off-target events from guides aimed at the target chromosome's homologous chromosome might be much more likely. Also, since we don't have specific target sites, we might be able to choose the target cut sequences to be less likely to collide with sites on the target chromosome.)
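The arithmetic behind that yield estimate can be checked in a few lines. This is only a sketch under the same rough assumptions as the paragraph above: 440 cut sites, each independently producing a cut on the target chromosome with probability 1%. The function name `p_target_untouched` is made up for illustration.

```python
import math


def p_target_untouched(n_sites: int, off_target_rate: float) -> float:
    """Probability that none of n_sites independent cuts lands on the target chromosome."""
    return (1.0 - off_target_rate) ** n_sites


n_sites = 440  # figure from the text: about 10 cuts on each non-target chromosome
rate = 0.01    # assumed per-site chance of an off-target cut on the target chromosome

exact = p_target_untouched(n_sites, rate)
approx = math.exp(-n_sites * rate)  # the e^(-4.4) approximation used in the text

print(f"exact: {exact:.4f}  approximation: {approx:.4f}")
```

Both numbers come out at a bit over 1%, so on these assumptions roughly one cell in a hundred would keep an intact target chromosome.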
5. Update October 2022: current thoughts on step 4
The issue with step 4, deriving an embryo from a set of extracted chromosomes, is the epigenomics. Normally during gametogenesis and fertilization the germ-line DNA is modified or "reset" so that when the embryo starts growing, the DNA (genome + epigenome) is totipotent for the first stages of embryogenesis. Chromosome extraction as discussed in the rest of this post works with large samples of somatic cells. Somatic cell DNA doesn't have the right epigenomic markers it would need to straightforwardly generate a healthy embryo.
So the question is, how to get DNA with the right epigenomic markers? One way would be to modify the epigenome by directly applying artificially procured transcription factors. This requires finding enough of the right transcription factors, which aren't well-understood yet.
But another very speculative cluster of ideas would be to piggyback off the normal reproductive process, which "already knows" how to epigenomically prepare DNA for embryogenesis.
Idea 1: resetting somatic DNA with oocytes
Use the environment of the oocyte to reset DNA. That is, fertilize an oocyte with a somatic DNA haploid genome H. The resulting zygote will probably not develop into a healthy embryo, but it might divide a few times. Let it divide, and then extract the chromosomes H. Now H has been reset to some extent. (This is how animal cloning works; after somatic cell nuclear transfer (Wiki) to an oocyte, the oocyte reprograms the somatic DNA well enough to sometimes produce a viable organism.) Possibly repeat this a few times, recycling H back into a new oocyte. Do this also with another haploid genome H'. Then combine H and H' in an enucleated oocyte to form an embryo that's completely genomically selected.
This probably doesn't work, though I don't know why. I have no idea whether this would correctly epigenomically reset the DNA, especially not well enough to serve as a full genome for a healthy embryo. Also, these "pseudo-fertilized" oocytes might not divide very much, which might make it hard to re-extract H, depending on the yield of whatever chromosome extraction method is being used.
Idea 2: embryonic chromosomes
Instead of targeting chromosomes from the parents, produce some embryos from the parents by IVF. Let these embryos grow a little bit, perhaps to somewhere in the blastocyst stage. Sequence them, and extract a desired full genome. Use that full genome to fertilize an enucleated oocyte. Since this DNA has gone through the normal reproductive process, and has only gone through "some small amount" of change since it was zygotic DNA, it ought to be epigenomically similar to zygotic DNA. Maybe that means it can be used to produce a healthy embryo.
I have no idea if something like this might work. I'm curious to know why it wouldn't work. Apparently to say that cells are "totipotent" is ambiguous, and it may be that even at the morula stage, the DNA is already too much modified to necessarily produce a healthy embryo if used on its own.
Idea 3: twinning
Assuming that the two blastomeres after the first cleavage of the zygote are still truly totipotent, maybe one could repeatedly separate these initial blastomeres, creating many totipotent zygotes. That might solve the yield problem.
Idea 4: iterated programmed chromosome fusion
(This isn't addressing embryogenesis but may as well go here.)
There's been recent success in programmed chromosome fusion; see the paper here: https://www.science.org/doi/10.1126/science.abm1964 and a description here: https://denovo.substack.com/p/mice-are-the-new-yeast. Using this technique, it might hypothetically be possible to iteratively induce breaks at target sites on homologous chromosomes in early embryos, hope that they repair to produce a desired crossed-over chromosome, extract that chromosome, and then repeat with a new (possibly twinned) embryo, targeting a new site. This is very speculative, but if it works, it might be a much faster, cheaper, and more effective method of GS than any other except WGS.
6. Detailed version and questions about implementation
(All the implementation ideas are basically pure lay speculation.)
For each of these ideas (focusing on the ones that are most plausible), I wonder:
- Can it work? What are the challenges to making it work?
- Has some version of it already been done? What was the yield and the cost? What would be the likely yield and cost if optimized?
- Does it damage the chromosomes being selected? Can the damage be reversed or repaired, e.g. un-CRISPRing segments that have been CRISPRed in?
1. Sequence some somatic cells, obtaining their genomes.
(This is known how to do relatively inexpensively.)
2. Select a full complement of 46 chromosomes out of those sequenced.
(This is a separate topic, dealing with polygenic scores and the genotype-phenotype function.)
3. Isolate known chromosomes.
For each selected chromosome C, make it so that C can be manipulated separately from the other chromosomes in its cell, without damaging C, and while knowing that it's C as opposed to another chromosome.
This probably would involve arresting mitosis in early prophase or in anaphase, so that the chromosomes are condensed, unentangled, and not made of joined twin chromatids. Some speculative ideas for addressing some or all of these steps (ELISA and FACS seeming intuitively best):
- CRISPR GFP. Find a sequence that's unique to C, which should exist since C will have polymorphisms with the other chromosome of the same number. CRISPR in a GFP at that sequence.
- ELISA. CRISPR in a "velcro loops" patch to C (in many copies of the cell). Attach matching "velcro hooks" bodies to the assay plate. Dissolve the cells and wash the solution over the plate. Recover C, which is attached to the plate. (This is an improvement of my silly previous proposal to mechanically pull C out of the cell by inserting a strand with the "velcro hooks" attached to one end, and then pulling on the other end of the strand once the "velcro" has bonded.)
- FACS. CRISPR a GFP into C, or use FISH. Then dissolve the cell membrane. Then use FACS, except with chromosomes instead of cells (e.g. by surrounding individual chromosomes with small lipid membranes).
- Visual identification. Just look in a microscope and locate C, maybe by size.
- Visual identification with CRISPR. CRISPR in a GFP, or use FISH, then use the GFP to visually identify C.
- Tweezers, suction. Mechanically remove chromosomes, e.g. by dissolving the cell membranes and then using micro-tweezers or micro-suction.
- Complementary dissolution. Somehow target all chromosomes but C to be dissolved. Then you have a (no longer viable) cell with just C.
- Complementary deactivation. Somehow deactivate or disrupt all chromosomes other than C. E.g., bind to the middle of each of those chromosomes a protein that prevents the centromere from attaching to the spindle fibers or lining up at the metaphase plate. Then, after mitosis, you hopefully have two cells each with a nucleus containing only C, and a random selection of extranuclear chromosomes. Then extract the nucleus. This might, though, just give you a somewhat random selection of nuclear chromosomes.
- Bacteria. Dissolve the cells. Mix in bacteria that will take up the chromosomes and copy them when they divide. Select about 100 bacteria. (Each set of 46 random ones has a .63 chance of having at least one with C.) Isolate and culture them individually. Sequence samples from each culture. Find one with C. Now you have a culture of bacteria with C being the only human chromosome in them. This strategy may be costly if there are many input genomes, because it takes on the order of 46² sequencings (though the "genomes" sequenced are roughly 1/46th as long on average). If other techniques are partially successful, then using bacteria wouldn't be so costly, e.g. if you can visually select the right type of chromosome but not the exact homologous chromosome then you only need on the order of 2×46 sequencings.
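The .63 figure in the bacteria idea is a simple "at least one hit" probability. Here is a minimal check, assuming (as the text implicitly does) that each selected bacterium carries exactly one human chromosome drawn uniformly at random from the 46 in the source cell. The function name `p_at_least_one` is made up for illustration.

```python
def p_at_least_one(n_picked: int, n_chromosomes: int = 46) -> float:
    """Chance that at least one of n_picked bacteria carries the target chromosome C."""
    miss = 1.0 - 1.0 / n_chromosomes  # chance that a single bacterium carries something else
    return 1.0 - miss ** n_picked


p46 = p_at_least_one(46)
p100 = p_at_least_one(100)

print(f"46 bacteria:  {p46:.2f}")
print(f"100 bacteria: {p100:.2f}")
```

Picking 46 gives roughly the .63 quoted above, and picking about 100 raises the chance further. If uptake is uneven across chromosomes, or some bacteria take up several chromosomes or none, these numbers would shift.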
4. Create an embryo that develops into an organism with a genome composed of the selected chromosomes.
I haven't thought much about this step. In part that's because I don't know enough details about what has to happen in an ovum for it to start developing into an embryo (e.g., what exactly does the sperm have to do), or in general about techniques for manipulating cells. In part it's because Metacelsus linked this paper on electrofusion, and it seems like if that works, then one should be able to do the same thing with isolated chromosomes (perhaps wrapped in their own little lipid bilayer). Maybe deactivating or dissolving chromosomes (which are AFAIK fictional techniques) plus electrofusion could be used to replace the genome of gametes or of a zygote.
7. Appendix: note on strength of chromosome selection
[Update: see this post for more accurate estimates and fuller explanations.] My current estimates are based on the assumption that chromosomes are of equal length, which is far from the case; on the assumption that the input genomes are drawn randomly, rather than already being selected for some traits; and on the assumption that traits are linear in number of alleles at a single locus, which may be far from the case (e.g. if heterozygosity is desirable for rare alleles that are highly deleterious if homozygous). There may be other false assumptions, such as that traits are controlled by genes roughly evenly distributed across chromosomes. On these assumptions, and assuming that we're selecting the entire genome, I estimate that selecting from just two random input genomes, you get more than 4 standard deviations over the mean. Which is quite a lot. And, the effects are much larger with more input genomes (though with rapidly diminishing returns), e.g. +11 SD for 10 starting genomes, and +14 SD from 30 starting genomes.
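These estimates can be checked by sampling. The sketch below uses exactly the simplifying assumptions listed above: all 46 chromosomes contribute equally, chromosome scores are independent Gaussians, input genomes are random, and effects are additive. For each of the 23 chromosome types it pools the homologs from all input genomes and keeps the best two. The function name `expected_gain` and the trial count are arbitrary choices for this illustration.

```python
import math
import random

N_TYPES = 23    # chromosome types (sex chromosomes treated like any other pair)
PER_GENOME = 2  # homologs of each type in a diploid genome


def expected_gain(n_genomes: int, trials: int = 500, seed: int = 0) -> float:
    """Average score of the selected genome, in whole-genome standard deviations."""
    rng = random.Random(seed)
    sd = 1.0 / math.sqrt(N_TYPES * PER_GENOME)  # one chromosome has 1/46 of the variance
    total = 0.0
    for _ in range(trials):
        for _ in range(N_TYPES):
            pool = sorted(rng.gauss(0.0, sd) for _ in range(PER_GENOME * n_genomes))
            total += pool[-1] + pool[-2]  # keep the two best homologs of this type
    return total / trials


gains = {n: expected_gain(n) for n in (2, 10, 30)}
for n, gain in gains.items():
    print(f"{n:>2} input genomes: about +{gain:.1f} SD")
```

On these assumptions the sampled gains should land in the neighborhood of the figures quoted above. Unequal chromosome lengths, non-additive effects, or pre-selected inputs would all change the result.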
A short heuristic argument that gains are fairly large (which can be cashed out more precisely and checked with sampling), still relying on the above assumptions: chromosomes have 1/46 the variance of the whole genome. That is, they have standard deviation equal to 1/√46 of the whole genome's. The most extreme chromosome out of a pool of N will be roughly Gaussian_CDF⁻¹(1 − 1/N) of its own standard deviations out, i.e. Gaussian_CDF⁻¹(1 − 1/N)/√46 out in terms of the whole genome's standard deviation. The second most extreme chromosome will be almost as extreme as the most extreme chromosome, especially for a large input set. So if N is large enough that 1/N is as rare as α SDs, then the selected genome will be something like 46α/√46, roughly 6.8α SDs. So selecting from 6 genomes (α ≈ 1) gives you about 6.8 SDs, and from 40 (α ≈ 2) gives you about 13.6 SDs.
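The heuristic can be evaluated directly with the standard library's normal quantile function. This is a sketch of the rule of thumb only: it takes the upper-tail quantile at 1 − 1/N as α and treats all 46 selected chromosomes as equally extreme, which overstates things somewhat. The function name `heuristic_gain` is made up for illustration.

```python
import math
from statistics import NormalDist


def heuristic_gain(pool_size: int, n_chromosomes: int = 46) -> float:
    """Rule-of-thumb gain in whole-genome SDs: 46 * alpha / sqrt(46)."""
    alpha = NormalDist().inv_cdf(1.0 - 1.0 / pool_size)  # how rare "1 in N" is, in SDs
    return n_chromosomes * alpha / math.sqrt(n_chromosomes)


for n in (6, 40):
    print(f"N = {n:>2}: about +{heuristic_gain(n):.1f} SD")
```

The exact quantiles are slightly below the round values α = 1 and α = 2, so the computed gains come out a little under 6.8 and 13.6.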
Another flaw here is that selecting this way is very deterministic. For resulting genomes to be varied, you'd need to have a larger input set, or to select for different traits.