Cullens of Upton
Jim Cullen


DNA Results for the Cullens of Upton, Nottinghamshire


NEWS: Recently another Cullen descended from the Upton, Nottinghamshire line agreed to a Y-DNA test at the 37-marker level. The DNA results of a Chris Cullen of Devon, England indicate a near perfect match between our 37-marker haplotypes, putting our estimated time to a common ancestor well within the historical period and most likely early in the Upton family line as was suspected. Chris' results have provided much needed confirmation of our shared DNA signature and also indicate that some unique DNA markers may be able to identify Cullens of this group specifically. Further details are available below.

To date, only two Cullens descended from the Upton, Nottinghamshire line of Cullens have been DNA tested. The family's genetic signature still needs further confirmation from other Cullens who are not closely related to me but are descended from the same line. If our signature is correct, your results will be very similar to mine but may differ on perhaps two of the values in the tables below. It would also be VERY interesting to see results of DNA tests on Cullens descended from the Manorhamilton, Co Leitrim line. A connection between these two families has long been suspected and now we have the ability to actually make a strong case for or against the connection. If you are at all able to take part, please contact Bernie Cullen or myself with any questions you may have about DNA testing. For anyone descended from the Cullens of Co Leitrim there is a special incentive. I'm so keen on seeing the results that, if you're the first to agree to a 37-marker test, I'll split the cost with you 50/50. Contact me for details.

I highly recommend, at the very least, a 25-marker test! Current testing and analysis of Y-STR haplotypes, along with the recognized subclades and other information available, indicate that a 37-marker test gives the most useful results for the money. Based on the Cullen results obtained so far, your results are very likely to be some variety of "R1b" or "I". If your results indicate "R1b" then you will need those extra markers just to distinguish you from every other "R1b" out there since this is a very common haplogroup. If your results indicate "I" then you'll want to make maximum use of Ken Nordtvedt's information, which almost requires those extra markers to place you in the "I" haplotree. The 12-marker test is fine for low resolution work on a global scale but, for precise genealogical comparison and geographical distribution information concerning your particular haplogroup, you really do need at least 25 markers.

To read more about the Cullen DNA results we have so far, I've started a page for Cullen DNA Results. There are still Cullen families that have not yet had any DNA representatives. Cardinal Paul Cullen's ancestors are suspected to be the same as the Anglo-Norman family prominent in Cullenstown, Co Wexford. Some Cullen families of Co Leitrim may have descended from the same Cullenstown family. There is also the possibility that the Cullens of Upton, Nottinghamshire are a branch of the Cullen families from nearby Kent. One of my closest matches is a DNA result for another family in Kent, England. There is also a prominent line of Scottish Cullens from Lanarkshire. By comparing DNA results, supposed connections between these various family lines can either be given strong verification or shown to be highly unlikely. Given the poor paper trail on some of these Cullen family lines into the distant past, DNA is one method available to us to gain more information that would otherwise not exist.

Those of you who are descended from the Cullens of Upton, Nottinghamshire will find the following information to be very surprising, very informative, and at the very least - intriguing. I have already joined the Cullen Family DNA Project and submitted my sample to FamilyTreeDNA for analysis. It's a simple matter of scraping the inside of your cheek with a device that looks very much like a paper toothbrush. Getting the sample really is fast and painless - the seven week wait for the results to come back is not so fast - or painless! The results returned were quite surprising.

According to our results so far, the Cullens of Upton are members of the "I" haplogroup. A haplogroup is simply a group of individuals having DNA with similar characteristics or shared key values in their results that indicate a common ancestor. Only about seventeen percent of the European population falls into the "I" haplogroup and we are NOT one of the Big Three 'subgroups' in this haplogroup, which together account for 95 percent of haplogroup "I", so we are very much a minority; about six tenths of one percent of all lineages in the world today. Upon further inspection it was found that our particular DNA type has no cursory matches whatsoever in the database of tested individuals, meaning that our DNA type is indeed extremely rare. It is not unusual for DNA types to have hundreds of 12-marker matches and possibly a dozen or more 25-marker matches with varying degrees of relation. At a similar resolution of searching, our Cullen DNA reveals zero matches.

Our closest 25-marker matches in the YSearch database are at a genetic distance of seven, meaning that the most recent common ancestor is likely to be at least a couple thousand years ago. There is now a temporary page, Search Results, showing the closest matches to my haplotype. There isn't a whole lot of information there besides the tables themselves but it's there for those of you who are interested in the search results. Some of our closest genetic kin have very old roots in England, with a few scattered in Scotland, Germany, Ireland, and France. There will be another page soon also for other Cullens, now that I've gotten the method of transferring search results to HTML table format automated. The other Cullen results are absolutely fascinating and we're lucky to have such outstanding and interesting results for the few Cullens that have tested so far!

After some research and a discussion with Bernie Cullen, it appears highly likely that our DNA belongs to a specific and recently uncovered branch of the "I" haplogroup known as "I1b2*". As it is currently understood, the "I1b2*" haplogroup is divided into two varieties, "I1b2*-A" and "I1b2*-B". Several values in our DNA are indicators that we may be of the "I1b2*-A" subclade. For more on the "I" haplogroup, see Ken Nordtvedt's excellent web resource, Population Varieties within Y-Haplogroup I. Ken's work is right there on the edge, pushing into uncharted territory. He's very confident and very good at what he does. Ken's page is definitely THE place to look for breaking news on the "I" haplotree. Be aware that Ken's nomenclature for "I1b2*" in the "I" haplotree differs from the one at the Family Tree DNA website. According to the convention used there, we would be classified as simply "I1*". The difference is that FTDNA does not yet test a couple of the special SNPs that some other companies do. Our haplogroup is just very new and testing for it is not yet standard for every company.

So what does it all mean? I'll get to that in a minute... first let's have a look at the results as they are usually presented. DYS values are basically indicators of locations on the Y-chromosome where there is a test point. And the value associated with a DYS location is the number of repeats of the short section of DNA code at that location - thus the term "Short Tandem Repeats" or STR's for short. The collection of STR's at all the DYS locations tested is your haplotype - your genetic signature. If you're already familiar with YSearch, you can find my profile with the code DXF2E. There are two sets of results below - my STR's are indicated by JC and Chris' by CC:

DYS      393   390    19   391  385a  385b   426   388   439 389|1   392 389|2   458  459a  459b   455   454   447   437   448   449  464a  464b  464c  464d
JC        14    25    17    11    13    18    11    13    11    12    11    29    17     8    10    10    12    24    14    19    28    14    15    15    15
CC        13    25    17    11    13    18    11    13    11    12    11    29    17     8    10    10    12    24    14    19    28    14    15    15    15

Marker        460  GATA H4   YCAIIa   YCAIIb      456      607      576      570     CDYa     CDYb      442      438
JC             10       10       19       19       14       13       17       18       35       37       12       10
CC             10       10       19       19       14       13       17       18       35       36       12       10


Some of the markers in the above table are color-coded. Red indicates markers that Chris and I do NOT match on: DYS393 (14 for me, 13 for Chris) and CDYb (37 for me, 36 for Chris). Chris and I do not run into a common ancestor until about three generations after Richard Cullen who died in Upton in 1579, making our common ancestor right around the mid to latter 1600's. It is a natural thing that harmless mutations will accumulate at average rates over time in our respective lines. Given the time our family lines have been separate, the above number of mismatches is well within the expected range for natural mutations. This is actually good since our lines now have identifying mismatches. Any Cullen who tests in the future can look to our distinguishing markers to determine which line may represent closer relatives.
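As a rough illustration of how the mismatches are counted, the sketch below lines up the two haplotypes from the tables above and adds up the repeat differences. It assumes the simplest step-wise definition of genetic distance (the sum of absolute differences); comparison tools may treat multi-copy markers such as DYS464 or CDY differently, so take it as a sketch rather than the exact YSearch or FTDNA calculation. The names `MARKERS`, `mismatches` and `genetic_distance` are made up for this example.

```python
# FTDNA 37-marker order, as used in the two result tables above.
MARKERS = [
    "393", "390", "19", "391", "385a", "385b", "426", "388", "439",
    "389|1", "392", "389|2", "458", "459a", "459b", "455", "454", "447",
    "437", "448", "449", "464a", "464b", "464c", "464d", "460", "GATA H4",
    "YCAIIa", "YCAIIb", "456", "607", "576", "570", "CDYa", "CDYb", "442",
    "438",
]

# Jim Cullen's repeats, transcribed from the JC rows.
JC = [14, 25, 17, 11, 13, 18, 11, 13, 11, 12, 11, 29, 17, 8, 10, 10, 12, 24,
      14, 19, 28, 14, 15, 15, 15, 10, 10, 19, 19, 14, 13, 17, 18, 35, 37, 12,
      10]

# Chris Cullen's rows are identical except at DYS393 and CDYb.
CC = list(JC)
CC[MARKERS.index("393")] = 13
CC[MARKERS.index("CDYb")] = 36


def mismatches(a, b):
    """Return (marker, value_a, value_b) for every marker where a and b differ."""
    return [(m, x, y) for m, x, y in zip(MARKERS, a, b) if x != y]


def genetic_distance(a, b):
    """Simple step-wise distance: sum of absolute repeat differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


print(mismatches(JC, CC))        # [('393', 14, 13), ('CDYb', 37, 36)]
print(genetic_distance(JC, CC))  # 2
```

Both mismatches are a single repeat apart, which gives the genetic distance of two between the JC and CC rows.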

The markers in blue indicate those markers that help to identify our subclade as "I1b2*". The first twelve STR's or markers can be used to determine that Haplogroup "I" is the proper designation for our values. Further analysis indicates that we also have some distinctive marker values that indicate "I1b2*". At DYS455,454 we match the unique modal 10,12 for those markers. DYS454=11 and DYS455=8 or 11 for every other "I" haplogroup except "I1b2*". In the marker order that is standard for FTDNA, the distinctive signature "8,10,10,12" at DYS459a,b,455,454 is a sure sign of "I1b2*" status. There are other markers that can be helpful in determining if a haplotype is indeed "I1b2*". The most useful of these indicators is the YCAIIa,b combination of 19,19 which, as you can see in the table, is exactly what our repeats are for those markers.

There are three underlined markers; DYS19, DYS447, and DYS448. These markers help to identify which of the three subgroups of "I1b2*" Chris and I are members of. Since we are almost perfect matches we are of course members of the same subgroup which in this case is "I1b2*-A". I will explain more about these subgroups shortly.

The markers in green, which both Chris and I share, are almost unique within "I1b2*". Without Chris' markers to compare with mine I would almost have suspected an error in FTDNA's labs. Since Chris and I received the same results, a nearly perfect match, we can be sure that our unique mutations are not due to lab error. The rare 13,18 combination at DYS385a,b is found in only three "I1b2*" families; Cullen of Upton, Adam in Scotland, and Miranda in Mexico. DYS385b is normally 16 with a few 15's and 17's on either side so 18 repeats here is not too common at all. The 13 repeats at DYS385a is common enough, but in combination with the 18 at DYS385b, it is a rare find. Even more distinguishing is our shared 14 repeats on marker DYS437. This marker is almost universally 15 repeats within "I1b2*" and there are less than a handful that have repeats other than 15. Only one other known "I1b2*" haplotype shares our 14 repeats at DYS437 and this is a Steinmetz of Maryland, USA. This Steinmetz however does NOT share our 13,18 at DYS385a,b. His matching value of 14 at DYS437 is due to random chance and not close relation.

Chris' family and mine share a set of Cullen relatives in Radford, Nottinghamshire, meaning that our common ancestor was a Cullen from that village or was an ancestor of the branch of the Cullens that relocated there from the nearby village of Upton. Seeing how closely Chris and I match on our DNA signature and knowing how rare this signature is, we may also state confidently that there were no NPEs in either of our lines back to the point of our common ancestor. 'NPE' is an abbreviation for 'Non-Paternity Event', which is the polite way of saying that there was a 'milkman' in the family tree. At an accepted rate of about five percent chance per generation for an NPE, the 25 plus generations that separate us are an indication that we've dodged the bullet. This is not to say that there is no chance that Chris and I happen to carry the genetic signature of some remote 'milkman' on the shared portion of our family trees, or that such an event never occurred ANYWHERE in the Upton Cullen family tree. It only indicates that it is extremely unlikely that a 'milkman' exists in either of our family lines back to our common ancestor. The possibility of an NPE elsewhere in the Upton Cullen family tree still exists and is another reason that further DNA testing on related branches of the Cullen family tree is still necessary.
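To put a number on "dodging the bullet", here is a small worked calculation. It assumes, purely for illustration, that each generation carries an independent five percent chance of an NPE, so the chance of a clean run is 0.95 multiplied by itself once per generation. The function name `prob_no_npe` is made up for this sketch.

```python
def prob_no_npe(generations, rate_per_generation=0.05):
    """Chance that no NPE occurs in any of the given generations,
    assuming each generation is an independent trial."""
    return (1.0 - rate_per_generation) ** generations


# The 25 generations separating the two tested lines, at five percent each.
print(f"{prob_no_npe(25):.3f}")  # 0.277
```

Under those assumptions, a random pair of lines 25 generations apart would come through with no NPE only a little over a quarter of the time, which is why a near-perfect match on such a rare signature is reassuring.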

Chris and I represent two separate Cullen lines from early in the family's history in Upton yet we share some very unique markers. If we take our uncommon "I1b2*" subclade markers in conjunction with our rare distinguishing markers within the subclade, then we can in theory, based on current knowledge, compose a set of markers to uniquely identify the Cullens of Upton. We have of course the distinctive signature "8,10,10,12" at DYS459a,b,455,454. Within this group of "I1b2*" we add the uncommon 13,18 combination at DYS385a,b and the very rare 14 repeats on marker DYS437. Until our knowledge of marker differences within the Upton Cullen family tree changes, we can call this the DNA signature of the Cullen Family of Upton, Nottinghamshire:

Y-DNA STR Signature of Cullen of Upton, Notts
DYS       385a  385b  459a  459b   455   454   437
Repeats     13    18     8    10    10    12    14


I find it amazing that one particular family can be identified on the basis of just seven markers. What's even more amazing is that we could probably drop DYS459a, DYS459b, and DYS385a from the above signature and we would probably STILL have a unique set of only four markers that would identify our family specifically from all other known lineages in the world today. I say probably because we don't have a complete genetic map of the Upton, Nottinghamshire Cullen family tree. It's possible, for instance, that DYS437 may be 14 repeats for half of the descendants and 15 repeats for the other half. Our unique 13,18 repeats at DYS385a,b may not apply to ALL descendants. There may be any number of combinations of the above. There may also be other unique mutations that have not shown up yet because Chris and I are the only two so far to have been tested and we may not have these supposed unique mutations. We're pushing the limits here though since 16 generations is just not enough time to allow that many mutations to occur. We can only hope that, as more Cullens in this extended family are tested, we are able to identify unique combinations of marker repeats for each branch of the family.
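The seven-marker signature, and the reduced four-marker version, can be written down as a simple lookup. The sketch below is only an illustration: the names `UPTON_SIGNATURE`, `REDUCED_MARKERS` and `matches_signature` are made up here, and the "typical" haplotype is a hypothetical I1b2* result using the common values mentioned earlier (16 repeats at DYS385b and 15 at DYS437).

```python
# The seven-marker signature given above for the Cullens of Upton.
UPTON_SIGNATURE = {
    "385a": 13, "385b": 18,
    "459a": 8, "459b": 10, "455": 10, "454": 12,
    "437": 14,
}

# The four markers left after dropping DYS459a, DYS459b and DYS385a.
REDUCED_MARKERS = ("385b", "455", "454", "437")


def matches_signature(haplotype, markers=tuple(UPTON_SIGNATURE)):
    """True if the haplotype carries the signature value at every listed marker.
    A marker missing from the haplotype counts as a non-match."""
    return all(haplotype.get(m) == UPTON_SIGNATURE[m] for m in markers)


# A haplotype carrying the signature values, and a hypothetical "typical"
# I1b2* haplotype with the common 16 at DYS385b and 15 at DYS437.
upton = dict(UPTON_SIGNATURE)
typical = dict(UPTON_SIGNATURE, **{"385b": 16, "437": 15})

print(matches_signature(upton))                     # True
print(matches_signature(upton, REDUCED_MARKERS))    # True
print(matches_signature(typical))                   # False
print(matches_signature(typical, REDUCED_MARKERS))  # False
```

If future results show branches of the family differing at DYS437 or DYS385b, the dictionary would need a set of allowed values per marker rather than a single number.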

There is some indication that these unique mutational characteristics are found only in very closely related haplotypes - if not only in the Cullens of Upton, Nottinghamshire. The closest genetic relatives to the Cullens of Upton can be found in a search for genetic matches at YSearch comparing 37-marker haplotypes. Representative DNA samples from a family by the name of Brooks, also haplotypes belonging to I1b2*-A, are separated from the Cullens by a genetic distance of seven. Actually this is not all that far if you compare this to Chris Cullen's genetic distance of two from my own haplotype. The Brooks, as close as they are genetically, do not have the unique mutational characteristics of the Cullens. In the Brooks samples, DYS385a,b is found to be a perfectly normal 13,16 and their DYS437 is also the very common value of 15 repeats. The next closest genetic group is a family by the name of Chewning/Chowning from Co Kent in England. They are at a genetic distance of twelve from the Cullens of Upton. The DYS385a,b results for this family are 13,17 repeats and their DYS437 results are an expected 15 repeats. For the few haplotypes that share one or the other of our two unique mutations, we find that their overall genetic distance is quite large - meaning that they have acquired a mutation resembling ours but this was by chance and not by close genetic relation. They are located several branches away as compared to the Chewning/Chowning and Brooks families. The unique mutations that Chris and I share must then be confined within somewhere between two and seven genetic mutations away from us.

To continue with the analysis of the subclade designation for the Cullens of Upton, Nottinghamshire: there are two varieties of "I1b2*" at present, the 'A' and 'B' varieties found by Ken Nordtvedt. Whether a given haplotype is the 'A' or 'B' variety is determined mainly by inspection of DYS448. Refer again to the table where our Cullen DNA results are shown with these markers underlined. At DYS448 you can see that we have 19 repeats, indicating we are likely to be the more specific "I1b2*-A". If DYS448 were 21 instead, then we would be "I1b2*-B". I note also that there are a couple of people out there with DYS448=20 so I'd expect to see a possible third haplogroup, "I1b2*-C", sometime in the future. Whether or not this happens depends on determining if this variation is a simple mutation or an actual division of the variety. The data as it stands right now seems to indicate that these examples of DYS448=20 are due to diversity or spread through random mutations over time. There do seem to be weak correlations with other markers so the "I1b2*-C" will stay for now.

As of 31 March 2007, a third variety of I1b2* is official. Ken Nordtvedt has added this variety to his Haplogroup I Population Varieties spreadsheet (FounderHaps.xls) at http://knordtvedt.home.bresnan.net. I1b2*-C is characterized by the usual 8,10,10,12 signature at DYS459a,b,455,454 and by 20 repeats at DYS448. Along with DYS448, I1b2*-A,B,C can be identified by 17, 16, or 15 repeats at DYS19/394.

There are other markers that help separate the 'A' and 'B' varieties of "I1b2*". I prefer to stick to the combination of DYS447,448 since they are close to each other in the FTDNA marker order and they are fairly reliable. A 24,19 combination here indicates 'A' variety while a 25,21 combination indicates 'B' variety. Low values indicate 'A' and high values indicate 'B' so those of you who have 18 repeats at DYS448 will almost certainly find that you have 24 repeats at DYS447, indicating 'A' variety. Likewise, those with 22 repeats at DYS448 will find elevated values at DYS447 as well, indicating 'B' variety.

The following graphic illustrates the difference between the two varieties of I1b2*. The chart is based on weighted genetic distance, according to mutation rate, interpolated between two sets of modal haplotypes, one set for the X-axis and one set for the Y-axis. Two things of importance can be learned from the chart. One is that DYS448 separates the two varieties of I1b2* quite cleanly: the vertical line at about X=0.5 separates 'A' from 'B'. The other is that, if a given haplotype has 20 repeats at DYS448, then there is a good chance of that haplotype being of the 'A' variety. Note that the green dots mark those haplotypes with DYS448=20. There are only two dots on the right or 'B' side while the rest are on the left or 'A' side. Only one sits nearly on the dividing line. The graph also shows that, not surprisingly, I1b2* resembles I1c haplotypes more than it does I1b type haplotypes, since all of the I1b2* data points lie above 0.5 on the Y-axis.

My own haplotype is on the 'A' side of the graph. My dot is the top dot in the red triangle of dots at the bottom right of the 'A' side of the chart. Chris Cullen's dot would be found here also. We're somewhat separated from the majority of the other I1b2*-A members, which is to be expected - some of our markers are somewhat off of the modals. This just illustrates that our haplotypes, though clearly "I1b2*-A", lie near the edge of the cluster of haplotypes identified as "I1b2*-A".
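The rules of thumb above can be turned into a first-pass check. The sketch below is a simplification under stated assumptions: it requires the 8,10,10,12 signature at DYS459a,b,455,454, then looks only at DYS448, treating 19 or fewer repeats as 'A', exactly 20 as 'C', and 21 or more as 'B'. DYS448 is the main indicator but not the only one, and haplotypes with 20 repeats often sit on the 'A' side of the chart, so a result from this function is a hint, not a verdict. The function name `classify_variety` is made up for this example.

```python
# The defining I1b2* signature at DYS459a,b,455,454.
I1B2_SIGNATURE = {"459a": 8, "459b": 10, "455": 10, "454": 12}


def classify_variety(haplotype):
    """First-pass guess at the I1b2* variety from a dict of marker repeats.
    Returns None when the defining 8,10,10,12 signature is absent."""
    if any(haplotype.get(m) != v for m, v in I1B2_SIGNATURE.items()):
        return None
    dys448 = haplotype.get("448")
    if dys448 is None:
        return "I1b2*"       # signature present, variety undetermined
    if dys448 <= 19:
        return "I1b2*-A"     # low values indicate the 'A' variety
    if dys448 == 20:
        return "I1b2*-C"     # the variety made official in 2007
    return "I1b2*-B"         # high values indicate the 'B' variety


# The Cullen of Upton values: 8,10,10,12 with 19 repeats at DYS448.
cullen = {"459a": 8, "459b": 10, "455": 10, "454": 12, "448": 19}
print(classify_variety(cullen))  # I1b2*-A
```

A fuller version would also weigh DYS447 and DYS19, which the text uses as supporting markers.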



For the details on a more in-depth inspection of apparent clustering within the I1b2*, see the page on I(x) Variants. The page gives some analysis of three apparent variants of I1b2*. I would not define them as actual clusters of I1b2* since they do not fall clearly within I1b2* based on DYS459a,b,455,454 repeats alone. I have used the term 'variants' since the defining modals are different. Here I will only give a brief description. I1b2* as you already know is defined as a 8,10,10,12 combination at DYS459a,b,455,454. There are other groups of haplotypes readily identified as I1b2* though their repeats at DYS459a,b,455,454 are not 8,10,10,12. I have found three groups of these variant I1b2* haplotypes. The largest variant group is distinguished by an 8,9,10,12 combination at the defining markers and is a clear variant of the "I1b2*-A" subclade. After an inspection of this group, Ken Nordtvedt has come to believe that DYS459b mutated from "10" to "9" in "I1b2*-A" after it arrived in England. The other two variant groups are smaller and distinguished by an 8,10,10,11 or a 10,10,10,12 variant combination on the defining markers. See the I(x) Variants page for more analysis of these variant groups.

To verify for myself that "I1b2*-A" is the proper subclade for my DNA results, I've written a spreadsheet to measure the differences we have as compared to the modal or most frequently observed repeats for the other haplogroups closely related to ours. I've done a simple distance-squared calculation that's skewed (linearly before squaring) according to the mutation rates of the markers. The result is then scaled to a convenient size result. I did not take into account the distinctive modals observed for the various haplogroups since this is, for the most part, handled pretty well by the calculation itself. In the table below, the yellow boxes indicate the distinctive modal values used to identify various haplogroups. Clear boxes are additional markers used to further define the results or help in identification. Red indicates fast-mutating markers, and slow markers are in blue. The scores along the bottom of the sheet indicate how close my DNA matches the various haplogroups; lower scores being a closer match and zero being a perfect match.



As you can see in the results, "I1b2*-A" and "I1b2*-B" are by far the closest matches, with the A variety having an edge over B. My 19 repeats at DYS448 also match the modal for that variety. The results are in green boxes at the bottom of the table. Recent upgrades on my marker results and more clearly defined modals for "I1b2*-A" and "I1b2*-B" have confirmed that the Cullens of Upton are most certainly "I1b2*-A". According to Ken Nordtvedt, DYS448 is not the only criterion but it is the main one and a very good indicator. In the above table, there is a very large gap in the scores between "I1b2*" and every other haplogroup. There's little doubt about our "I1b2*" classification, especially when the above table was calculated without regard to unique modal features of the haplogroups. Working at this level of detail requires the additional twelve markers available in a 37-marker test. I wholly expect that these extra markers will see extensive use in genealogical work, especially as the world haplotree grows in complexity. I would venture a guess that 37 markers will be about the most needed to place one in the world haplotree; even recent branches have been described in detail with these same 37 markers.

As you may have gathered by now, the mutation rates mean that the number of repeats for any given DYS location may change randomly over periods of generations. The mutations of course are natural and harmless as they occur in the junk regions of the Y-chromosome and so are currently understood to really have no function. For those of you concerned with privacy issues, there is no other usable information to be gained from the knowledge of your particular mutations besides what is applicable to genealogical matters. The mutation rates are quoted per generation, meaning that for any given DYS value, there can be a changed value expected every 400 generations or so. This rate varies according to the DYS location but there are other factors not fully understood that can affect the mutation rates. In effect, the values wiggle back and forth and slowly spread over thousands of years, accounting for the differences we observe in the repeat values for various groups around the world. It's the mutations that have made it possible to trace, classify, and arrange the haplogroups into the world family tree. In this view, all humans alive today are the descendants of just one man, known as "Genetic Adam". Those of you who are not great fans of the Bible, fear not; "Genetic Adam" will occur naturally in a population whether there is a Divine Creator or not.

Given that STR's wiggle and spread over time, it is possible to figure an approximate age for haplogroups and assign a geographical point of origin for the group. As people migrated across the globe they took the mutations in their DNA with them and, theoretically, we should find a concentration of that haplogroup in the present day at the place of origin and a declining percentage of individuals in that haplogroup as we move away from the point of origin. Using the same kind of mathematics, it's possible to calculate an approximate number of generations back for two individuals until we can expect to find the MRCA or "Most Recent Common Ancestor".
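For a feel of the arithmetic, here is a deliberately crude estimate of the number of generations back to a common ancestor. It assumes every marker mutates at the same average rate of one change per 400 generations (the rough figure quoted above), that mutations in the two lines are independent, and that no marker has mutated twice or mutated back. Published calculators use per-marker rates and proper probability distributions, so treat this as a back-of-the-envelope sketch; the function name is made up for the example.

```python
def generations_to_mrca(genetic_distance, markers, rate_per_marker=1 / 400):
    """Crude point estimate of generations back to the common ancestor.

    Both lines accumulate mutations independently, so the expected number
    of differences after g generations is 2 * g * markers * rate.
    Solving for g gives the estimate returned here.
    """
    return genetic_distance / (2 * markers * rate_per_marker)


# A genetic distance of two over 37 markers, as between the JC and CC results.
print(round(generations_to_mrca(2, 37), 1))  # 10.8
```

Roughly eleven generations is in the same general range as a common ancestor a few centuries back, though with only two mismatches the uncertainty on any such figure is wide.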

This is one of the little mysteries I find with "I1b2*". The amount of wiggle and spread indicates a fairly young haplogroup, several thousand years or so before present, according to my own figuring. However, there is no real geographical point of origin. We are found spread very thinly, according to Ken Nordtvedt, "well-dispersed in continental Europe from Italy and Iberia, in France and Germany, and up through Denmark". This would seem to indicate an older haplogroup, or possibly one that just migrated faster than usual. At any rate, we do seem to be looking at a founder for our haplogroup being located, at least several thousand years ago, somewhere in the area of Denmark, Northern Germany, and the Netherlands. As more data comes in we will likely see that the origin is actually further south. We would likely have migrated westwards as Danish invaders or Vikings, who began to arrive in Britain in earnest in the latter half of the 9th century. Even today we find traces of German/Danish DNA in Britain due to the influence of these early settlers, especially in areas that once fell under Danish control. Southwell is one such area, where the concentration is three times higher than in other areas. It's no surprise then to find the Cullen family in Upton by Southwell, Nottinghamshire. We know the Cullens were there several generations before Richard Cullen who died in Upton in 1579/80 but it may be possible that we were there or in neighboring Kent far earlier than we previously thought.


DNA Results from Skeletal Remains in Lichtenstein Cave, Germany


One of the more interesting applications of my WSD (Weighted Squared Distance) method of matching haplotypes to modal values attributed to subclades involves the 3,000-year-old skeletal remains discovered in a cave in central Germany. Lichtenstein Cave is located in the Harz Mountains in Germany. The region is what was known as Lower Saxony and so the remains are sometimes referred to as 'Saxon' though it is unlikely these 3,000-year-old bones represent a people that we would recognize as historical 'Saxons'. What can be said is that it's unusual that these people saw fit to inter their dead in a cave when it was the practice of the time to cremate the dead and bury the remains in the open. It's possible that this was the only choice they had if they were confined to the mountains for one reason or another. There were about forty individuals interred in the cave and, using genetic evidence, it could be shown that many of the individuals were related as an extended family over four or possibly five generations. This is the first prehistoric family to have been identified through DNA analysis. The skeletal remains contained enough viable genetic material to allow sequencing and the Y-DNA STR data is provided as the usual markers we use today.

You can find more information on the Lichtenstein Cave DNA in this Qiagen article. There is also another good article at the ABC News in Science website. You can also find other sources of information using Google or whatever search engine you prefer.

The surprise is that the subclade for the DNA markers sequenced can be identified and that it is very likely that the designation is 'I1b2*-B'. I believe I was the first one to identify the DNA signature and give some quantitative measure to back up this belief. I did this by analyzing the markers using the same WSD method I use to identify modern day subclade designations from Y-DNA STR data. I originally posted the results to Rootsweb's DNA list. You can read the thread here, the contents of which have been reproduced below:

Regarding the 3ky old sample of DNA from Lower Saxony. I've run my own 'predictor' on the data given for his markers. The scale is relative and a lower score is better, zero is a perfect match. A good match is a low score separated from all the rest of the scores by a respectable margin.

The data given was:
DYS393=13, DYS390=25, DYS19=15
DYS391=11, DYS385a,b=13,17
DYS439=11, DYS389i=12, DYS392=11
DYS389ii=27, DYS437=15, DYS438=10

Across the world's haplogroups, the sample scores an average of 50.36 with a min/max range of (19.36-100.63). The score that stands out is Haplogroup I, Ix specifically, with a score of 6.86

J scored 23.39 : I* scored 19.36 : G scored 32.76

I1a scored 23.78 : I1b scored 27.68 : I1c scored 66.46

These are the old naming conventions but it's clear that the data prefers the haplogroups closer to the root of Haplogroup I. Inspection of the markers and simple genetic distance supports this.

I then ran the same data through a second 'predictor', specifically for Haplogroup I and its subclades, according to Ken Nordtvedt's naming conventions. This scale is also relative and, since it spans Haplo-I and its subclades specifically, is a separate scale from that given above.

Across Haplo-I subclades, the sample scores an average of 33.3 with a min/max range of (17.04-103.63). The two scores that stand out are I1b2*-A with a score of 6.86 and I1b2*-B with a score of 5.82

Again the data prefers subclades closer to the root of the haplogroup, scoring in the area of about 17 for them. I1b* scores 17.26 and I1b1*-Isles2 scores 17.17 which was expected since I1b2* and I1b1*-Isles2 have some similar mutational characteristics to I1b*.

The lower score for the B variety of I1b2* should be taken with a grain of salt. DYS385b is the main reason for the lower score but there just isn't enough data to make that call. The scaling system is weighted so additional markers could cause the final scores to drift but I am satisfied, due to good score separation, that the Haplo-I subclade for this data is I1b2*

Jim Cullen


We are currently attempting to contact the researchers who worked with the DNA samples to discuss possibly measuring one or two extra markers from any genetic material ( if any ) left in the laboratory samples. Ideally we would like to see the number of repeats at DYS454 and DYS455 which would verify the I1b2* subclade designation. For I1b2* we should see 12 repeats at DYS454 and 10 repeats at DYS455 - a very rare combination for any subclade or haplogroup in the world. Results of the inquiry will be posted here.

Below is a graph of the WSD ( Weighted Squared Distance ) figures of the Lichtenstein Cave DNA from the currently known subclades of Haplogroup I. Again, lower scores are better matches and zero would be a perfect match. The score is a measure of genetic distance scaled by the mutation rate of the individual markers. This puts all markers on equal footing when the individual scores are summed up. What I look for in these graphs is the lowest score and how far it is separated from the rest of the scores. An ideal match is a score near zero and the next closest match is a score that is much larger. Notice that the lowest points in the graph by a good margin are I1b2*-A and I1b2*-B, with I1b2*-B being a slightly better match due mainly to the number of repeats observed at DYS385b. I1b2*-A scored 6.86 and I1b2*-B scored 5.82 in the Haplogroup I scale.

Graph of WSD figures from Lichtenstein Cave DNA from known subclades of Haplogroup I
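To make the idea of a weighted squared distance concrete, here is a minimal sketch. It is not the spreadsheet behind the graph: the exact weights, scaling and subclade modals used there are not reproduced on this page. The sketch assumes each difference is multiplied by a weight before squaring, with the weight taken as the average mutation rate divided by that marker's rate (so slow markers count for more), and it compares the cave sample against my own twelve matching markers rather than against subclade modals. The names `wsd`, `rates` and `scale` are placeholders for this example.

```python
# Lichtenstein Cave sample, from the data quoted above.
CAVE = {"393": 13, "390": 25, "19": 15, "391": 11, "385a": 13, "385b": 17,
        "439": 11, "389i": 12, "392": 11, "389ii": 27, "437": 15, "438": 10}

# The same twelve markers from the JC rows of the results tables.
JC = {"393": 14, "390": 25, "19": 17, "391": 11, "385a": 13, "385b": 18,
      "439": 11, "389i": 12, "392": 11, "389ii": 29, "437": 14, "438": 10}


def wsd(sample, reference, rates=None, scale=1.0):
    """Weighted squared distance over the markers both haplotypes share.

    rates: optional dict of per-marker mutation rates (assumed input).
           Without it, every marker gets a weight of 1.
    scale: optional constant to bring the score to a convenient size.
    """
    shared = sample.keys() & reference.keys()
    if rates is None:
        weights = {m: 1.0 for m in shared}
    else:
        mean_rate = sum(rates[m] for m in shared) / len(shared)
        weights = {m: mean_rate / rates[m] for m in shared}
    return scale * sum(
        (weights[m] * (sample[m] - reference[m])) ** 2 for m in shared
    )


print(wsd(CAVE, JC))  # 11.0 with equal weights
print(wsd(JC, JC))    # 0.0, a perfect match
```

With equal weights the score comes from five differing markers (DYS393, DYS19, DYS385b, DYS389ii and DYS437). Supplying real per-marker mutation rates would shift the score, which is why only the relative separation between scores on the same scale is meaningful.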


An additional piece of evidence turned up when I ran these markers through the 'Search for Genetic Matches' utility at the Y-Search website. There were a dozen results within a genetic distance of 2 and all of them were I1b2* with a fairly equal mix of I1b2*-A and I1b2*-B varieties. One match is an 8,9,10,12 variant of I1b2*-A. The geographic origins are about what one would expect - pedigrees in England and Germany (one hit in Wales). Does it make sense that 3,000 year old remains can be identified with a modern subclade? Sure. Our phylogenetic tree begins to branch out 60,000 - 100,000 years ago. Haplogroup I began splitting prior to the last ice age and continued afterwards as well - this was 15,000 to 25,000 years ago roughly. I1b2* branched off probably midway through the I-Tree. The Lichtenstein Cave remains are dated to roughly 2,700 years ago. Comparatively then, the Lichtenstein Cave DNA is not exactly ancient. Call the 2,700 years roughly 100 generations and then compare this to the average marker mutation period of 500 generations per mutation. In this light I think it can be said that the WSD method should still work fairly well in matching the STR data to the subclades we recognize today. We can be almost certain that the DNA of the individuals interred in Lichtenstein Cave and identified as I1b2*-B is exactly what it appears to be - that of relatives of ours from roughly 100 generations ago in central Germany.

Document In Progress...




Y-DNA Certificate from FamilyTreeDNA




