Thursday, September 27, 2012

Jtest K14 - the Eurogenes Ashkenazi ancestry test

Update 27/11/2013: I've made the new K13 the default Eurogenes admix test at GEDmatch. It seems to hit the spot for most people. See here.

Update 07/10/2013: An upgraded version of the popular EUtest is now available at GEDmatch. See here.


I recently learned that the new Ancestry Painting at 23andMe will include an Ashkenazi reference group. To be honest, I’m not sure there’s much value in using a genetically bottlenecked population of varied biogeographical origins as a reference in this kind of analysis. The Ashkenazi mainly descend from a few hundred founders, but carry Central European, Eastern European, Middle Eastern, African and probably many other admixtures, as evidenced by their genome-wide and uniparental markers.

That’s quite a problem, because their relative inbreeding means they produce strong ancestral clusters in many analyses, such as ADMIXTURE runs. However, these clusters are made up of allele frequencies from a wide range of sources and, paradoxically, it’s the relatively more outbred populations which contributed to the Ashkenazi gene pool at its formative stages that often end up showing Ashkenazi admixture in such tests, despite not having any. I've seen this happen regularly in my experiments with ADMIXTURE and STRUCTURE, and I'm pretty sure I could find an example in a peer reviewed study if I tried.

That’s just how things work with the algorithms we have available to run these sorts of tests. Nevertheless, since 23andMe is incorporating an Ashkenazi cluster into its new painting, I thought I’d try to come up with an Ashkenazi ancestry test to perhaps get a rough idea of what we might expect. I'm using ADMIXTURE in supervised mode, and basically trying to recreate clusters that have shown up in a variety of fine-scale analyses, including my ChromoPainter run of Northern European samples. It’s still a work in progress, but below are links to files that many of you might find useful.

Jtest K14 files

Jtest averages for selected populations

EUtest K13 files

EUtest averages for selected populations

The Jtest folder contains files that can be used to make an Ashkenazi ancestry test/chromosome painting with 14 Eurasian and African clusters. The EUtest folder contains the same files, except that the Ashkenazi allele frequencies have been removed. It’s useful to cross check results from both tests, mainly to see what’s hiding under the Ashkenazi admixture if it shows up in the Jtest.

Based on a few test runs today, I’d say that the noise level for the continental clusters is much less than 1%. But it rises to a few per cent for the intra-West Eurasian clusters. In other words, if you’re European, then you might score something like 0.02% in the Sub-Saharan cluster, which basically means 0%. However, you might get around 2% in the Middle Eastern cluster, even though you’re from Central Europe, and you don’t have any recent Middle Eastern ancestry. You can blame various prehistoric and historic migrations into Europe for these seemingly quirky results, and also the fact that Mesolithic Europeans were significantly Eurasian (i.e. Siberian, Amerindian and South Asian-like).

The Ashkenazi cluster is very similar to the Middle Eastern cluster in that regard. So anyone who gets an Ashkenazi score of around 2-3% either has very distant Jewish ancestry or, more likely, none at all. However, those who show more than 25% membership in that cluster are almost certainly of fully Ashkenazi ancestry, and their genomes are peppered with Ashkenazi-specific chromosomal segments.

There’s really not much difference between 2% and 25%, you might say. In fact, there is if we say there is. As always, the main thing to remember is that these clusters don’t really exist, because genetic variation is clinal, so the cluster names are basically arbitrary and it’s always the relative results that matter. That’s why to really understand what your scores mean, you need to compare them with those of other users.

Obviously, it's best to compare with people from the same ethnic and/or regional groups. If the Ashkenazi + East Med scores look relatively inflated, that's a sign of recent Ashkenazi ancestry.

Feel free to use the files above for anything you want, except commercial stuff. Please note, I make no guarantees that they’ll provide accurate results for everyone. I might update this post early next week with new and/or additional files and more tips.


Update 6/10/2012: The Jtest K14 and EUtest K13 will soon be available at GEDmatch, accompanied by an "Oracle" population matching analysis and maybe even a 3D genetic map. If all goes to plan, the population matching test should be able to give a decisive yay or nay to anyone wondering whether they have recent Ashkenazi ancestry.

By the way, below is a PCA based on the Jtest averages for selected populations. It was produced by one of my project members so that we could check the reliability of the 14 "ancestral" components. The samples were classified into clusters based on their highest peaking component. So, for instance, the Scots are in the light blue Atlantic cluster, along with French Basques, because the Atlantic component dominates in both groups. However, overall, they're more similar to other samples than to each other.
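The classification rule described above (each sample goes into the cluster of its highest peaking component) can be sketched in a few lines of Python. This is only an illustration, not the script my project member used. The population labels and percentages are made up, and each row lists only a subset of components, so the numbers don't sum to 100.

```python
def dominant_component(averages):
    """Return the name of the component with the largest share."""
    return max(averages, key=averages.get)


def group_by_dominant(populations):
    """Map each dominant component to the sorted populations it labels."""
    groups = {}
    for name, averages in populations.items():
        groups.setdefault(dominant_component(averages), []).append(name)
    return {component: sorted(names) for component, names in groups.items()}


# Made-up illustrative values, NOT real Jtest averages.
example = {
    "Population A": {"Atlantic": 40.0, "Middle Eastern": 5.0, "Ashkenazi": 2.0},
    "Population B": {"Atlantic": 55.0, "Middle Eastern": 1.0, "Ashkenazi": 0.5},
    "Population C": {"Atlantic": 10.0, "Middle Eastern": 60.0, "Ashkenazi": 8.0},
}

clusters = group_by_dominant(example)
for component, names in clusters.items():
    print(component, names)
```

Note that two populations can land in the same group purely because the same component peaks in both, even when the rest of their profiles differ a lot, which is exactly the Scots and French Basques situation.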

As per above, the plan is that GEDmatch will soon offer a 3D genetic map based on the loadings from this PCA.

Update 11/10/2012: The Jtest and EUtest are now on offer at GEDmatch. The quickest way to get there is via this link to the Ad-Mix page. Then, from the drop down menus, choose Eurogenes, followed by Jtest.

First run the Admix test to check whether your Ashkenazi admixture is significantly higher than expected for your part of the world (as per above, Jtest averages for selected populations are available here). Then move on to the Oracle analysis by pressing the relevant button at the bottom of the page.

If your Ashkenazi admixture is clearly elevated, and the top 20 single and/or mixed mode Oracle results show AJ (Ashkenazi Jews) as one of your potential matches, then it’s likely you have recent Ashkenazi ancestry.
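For those who like to see the logic spelled out, here's a rough Python sketch of the two checks. It isn't how GEDmatch or the Oracle works internally. The function name, the input format (each Oracle match as a tuple of population labels) and the 3-point margin used to decide what counts as "clearly elevated" are all assumptions made for illustration, so pick a margin that makes sense for your own reference group.

```python
def ashkenazi_screen(user_score, population_average, oracle_matches,
                     margin=3.0, top_n=20):
    """Combine the two checks: elevated admixture and AJ among Oracle matches.

    user_score and population_average are Ashkenazi percentages.
    oracle_matches is a ranked list of tuples of population labels
    (one label for single mode, several for mixed mode).
    margin is a hypothetical cutoff in percentage points, not an official one.
    """
    elevated = (user_score - population_average) > margin
    aj_in_oracle = any("AJ" in match for match in oracle_matches[:top_n])
    return {
        "elevated": elevated,
        "aj_in_oracle": aj_in_oracle,
        "likely_recent_ancestry": elevated and aj_in_oracle,
    }


# Made-up inputs for illustration only.
result = ashkenazi_screen(
    user_score=14.0,
    population_average=2.5,
    oracle_matches=[("Population X",), ("AJ", "Population Y")],
)
print(result)
```

Both conditions have to hold for a positive result. An elevated score on its own, or an AJ match on its own, isn't enough.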

Whether that’s the case or not, you can then move on to the Chromosome Painting feature to see where the potential Ashkenazi admixture is located in your genome. It’s useful to cross check the results with those from the Ancestry Finder at 23andMe to assess their accuracy.

As already mentioned, the EUtest is exactly the same as the Jtest, but with the Ashkenazi allele frequencies taken out. You can use this option to see what’s hiding under your Ashkenazi admixture in the Jtest. To compare your results with those of selected populations from Europe, Asia and Africa, refer to the EUtest averages sheet.
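One simple way to do that comparison is to subtract your Jtest score from your EUtest score for each component and see which ones grew the most. Below is a minimal Python sketch. The scores are invented, and "Other" is a placeholder lumping together the remaining clusters rather than a real component name.

```python
def redistribution(jtest, eutest):
    """Return EUtest minus Jtest per EUtest component, largest gain first."""
    shifts = {
        component: round(eu_value - jtest.get(component, 0.0), 2)
        for component, eu_value in eutest.items()
    }
    return dict(sorted(shifts.items(), key=lambda item: item[1], reverse=True))


# Made-up scores for one hypothetical user, in percent.
jtest_scores = {"Ashkenazi": 12.0, "East Med": 8.0, "Atlantic": 30.0, "Other": 50.0}
eutest_scores = {"East Med": 15.0, "Atlantic": 32.0, "Other": 53.0}

shifts = redistribution(jtest_scores, eutest_scores)
for component, gain in shifts.items():
    print(f"{component}: {gain:+.2f}")
```

In this invented example most of the Ashkenazi score moves into East Med once the Ashkenazi allele frequencies are taken out, which is the kind of pattern the cross check is meant to reveal.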

Please note: it's important to interpret the results with insight. You need to learn how the system works, pay attention to the types of populations that appear in your results, consider carefully why they might be paired with other populations, and of course study the statistics in detail. Expecting a bullseye classification at the top of the Oracle list is likely to lead to major disappointment for many people, simply because I don't have enough samples to represent all of the substructures that exist around the world, especially within countries.

I’ll try to update both tests in a few weeks, after seeing how successful the whole setup is at predicting Ashkenazi admixture and locating it in the genome. One of the main goals will be to improve the accuracy of the Oracle analysis for everyone, including New World people with Amerindian admixture.

Update 21/10/2012: Below are spatial maps of a few of the ancestral clusters from the Jtest, courtesy of project member FR7.

Update 4/12/2012: The Jtest and EUtest at GEDmatch now include a new tool called the 4-Ancestors Oracle (aka Oracle-4), as well as the 3D PCAs I promised earlier. Oracle-4 will attempt to pinpoint your ethnic group of origin, and then also work out the most likely combinations of two, three and four ancestral populations which make up your genome. However, this doesn't mean the results will actually show your ethnic group, or those of your parents (in dual mode) or grandparents (4-way mode). They might for many people, but for others they'll reflect the best possible outcomes from the reference samples available.

Enjoy, and feel free to give feedback to John at GEDmatch if you think it might be useful (but please don't spam his account).