Doing the Math on Dating

I’ve recently found myself single for the first time in almost a decade (I’ll skip the back story if it’s all the same to you), and it’s taken some getting used to. There’s good and bad both but, on reflection, I don’t think I want to stay single long term. The bad news is that the percentage of the population that is unmarried and has no children starts shrinking rapidly at age 26, and I’m already 7 years past that. I’d like to believe that somewhere on the edge of the bell curve is the woman for me, but that still leaves the question of where and how to meet her.

A quick look at Google provides the baseline number: 7.15 million people in the Bay Area. I don’t want to drive all the way to San Jose, though, and I’m not interested in people who prefer suburbs, so if I look at just the urban inner Bay Area (San Francisco, Oakland, Berkeley, El Cerrito, Emeryville, Albany, Alameda) I’m left with 825,863 + 400,740 + 115,403 + 24,048 + 10,335 + 18,969 + 75,641, or 1,470,999 people.

Now, if the Bay Area conforms to national averages, 39.8% are in their 20s or 30s and 53.4% are either never married or divorced. Out of those, approximately half will be female and 4 percent will have an IQ that’s at least in the same general ballpark as mine (because brains are sexy, dammit). NPR says 50% of Americans are overweight and 15% are obese (obese being a subset of overweight). I don’t mind a couple of extra pounds (I’m not as skinny as I used to be), but there’s a fine line there. I’ll take a shot in the dark and say 40% of people are too heavy for my taste. A further 20% are too short according to the census data: I’m 6’2″, so anyone under 5’2″ is out. So that’s the first round of eliminations.

I wasn’t able to find good numbers on how many unmarried people are involved in a committed relationship (i.e. not actually single), but from my own social circles I’d say it’s probably north of 70%. Based on my preliminary investigations on dating sites, many more are involved in or prefer polyamorous relationships, removing them from consideration. I’ll ballpark the total between those groups at around 80%, a massive reduction. The data also shows that highly intelligent women are far less likely to want children, so I’ll take a whopping 70% cut since having kids is on my bucket list. I also need to narrow the pool based on political inclination. Smart people statistically lean liberal, but I’m well to the left of liberal, so I’ll take another 50% reduction. Wikipedia says that 15.4% of people in the SF Bay Area are lesbian, gay, or bisexual, but that number will be higher in the cities I’ve chosen because San Francisco and Oakland’s large LGBT communities skew the numbers. I’d be fine with a bisexual woman; I don’t much care who my partner finds attractive as long as she comes home to me. Unfortunately I can’t find good numbers on how many of that 15.4% are bisexual, but I know that women are statistically more likely to be bisexual, so I’m going to guess they’re over-represented in the female half of that 15.4%. I’ll ballpark it and guess that maybe 10% of women in the SF Bay Area are not interested in dudes.

So my final formula is 1,470,999 * 0.398 * 0.534 * 0.5 * 0.04 * 0.6 * 0.8 * 0.2 * 0.3 * 0.5 * 0.9 = 81.0348
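For anyone who wants to play with the numbers, here is a minimal Python sketch of the same funnel. It uses only the figures quoted above. The pairing of each population with a city assumes the numbers were listed in the same order as the city names, the filter labels are my own shorthand, and `run_funnel` is just a hypothetical helper name.

```python
# City populations as quoted in the post. Pairing each figure with a city
# assumes the numbers appear in the same order as the city names.
populations = {
    "San Francisco": 825_863,
    "Oakland": 400_740,
    "Berkeley": 115_403,
    "El Cerrito": 24_048,
    "Emeryville": 10_335,
    "Albany": 18_969,
    "Alameda": 75_641,
}

# Each filter is (shorthand label, fraction of the pool that is KEPT).
# A "70% cut" therefore shows up as 0.3, and so on.
filters = [
    ("in their 20s or 30s", 0.398),
    ("never married or divorced", 0.534),
    ("female", 0.5),
    ("IQ in the same ballpark", 0.04),
    ("not too heavy for my taste", 0.6),
    ("5'2\" or taller", 0.8),
    ("single and not polyamorous", 0.2),
    ("wants kids", 0.3),
    ("politically compatible", 0.5),
    ("interested in men", 0.9),
]


def run_funnel(start, steps):
    """Apply each keep-fraction in turn; return (label, remaining) per step."""
    remaining = float(start)
    rows = []
    for label, keep in steps:
        remaining *= keep
        rows.append((label, remaining))
    return rows


total = sum(populations.values())
rows = run_funnel(total, filters)

print(f"{'starting population':<30}{total:>14,}")
for label, remaining in rows:
    print(f"{label:<30}{remaining:>14,.1f}")
```

Printing the running total after each step makes it easy to see which criteria do the most damage, and swapping in a local statistic for a national one only means changing one number in `filters`.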

All in, that leaves somewhere in the neighborhood of 81 women who meet my criteria for things I can easily find or guesstimate numbers for. Other calculations are harder. For instance, highly intelligent people skew towards atheism, which works in my favor since I’m not interested in religious people. Unfortunately, I don’t have good figures on how heavy that skew is. I also don’t have any way to accurately estimate the percentage of my remaining population who are doing something with their lives. I can’t just use income, since for me a person who works at a nonprofit or does advocacy work gets a thumbs up even though they’re likely underpaid, while an investment banker or corporate lawyer gets a thumbs down. Add to that the massive uncertainty of having had to ballpark so many numbers and there’s a lot of room for error. This is doubly true because the Bay Area leans left and has an unusual number of highly intelligent people who came here to work in tech. It’s probable that I’ve eliminated too many people by using national statistics that aren’t locally accurate, so my actual pool may be larger than I’ve calculated. Without better data I simply don’t have a good way to make a more accurate estimate. Even so, my total pool is probably well under 100 women out of a population of almost 1.5 million.

In other words, the odds of meeting a woman who fits even the most basic criteria at a bar are dismal.

That leaves online dating or hiring a matchmaker as my only real options. Now I love Fiddler on the Roof as much as the next guy, but I want a bit more control, so outsourcing is out. Fortunately, I am not the first person to take a data-driven approach to this topic. A bit of searching turned up Amy Webb’s TED Talk on hacking online dating. Watching it, I had to laugh seeing her go through the process of building an equation very similar to mine. I’m not ready to build a full-on ranking system or use dummy profiles to gather intel the way she does, but in general terms I really like her methodology.

Now, in an ideal world, this post would end with a witty rundown of how I used my knowledge of Search Engine Optimization (because an internet dating site is really just a specialized search engine) and marketing to find the person I’m looking for.  Unfortunately, that will take time and I haven’t found her yet.  If and when I do, I’ll write a follow-up.  🙂

I can say that the obvious best practice on dating sites is to skip irrelevant matching questions about trivial topics, since they just muddy the water. Tastes in music and favorite movies are poor indicators, for example. And then there’s the task of optimizing my own profile. As Amy Webb points out in her video, smart people tend to write more in their profiles (certainly true in my case), but the best profiles are concise and easy to parse. That means I need to edit relentlessly. Initial research into what women in my target group are looking for indicates that being tall, white, college-educated, and intelligent works in my favor, but being a bit heavier than I’d like to be does not. No surprises there. The “bad boy” factors like my motorcycle and tattoos may actually hurt me, contrary to what the popular wisdom would suggest. And being divorced definitely hurts me, not because of an objection to divorce on principle, but because people assume it means I have kids and have to pay alimony. Since neither of those is the case, I’m better off leaving the divorce off my profile and only mentioning it when past relationships come up in conversation.

The next step would be to build a script that can identify popular men in my age cohort and scrape their profiles to look for keywords and keyword density. I may or may not get that far, though; we’ll see. There’s a lot of additional room for optimization as well, but I don’t know how far down that road I’ll go… Romance is hard to quantify.
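If I do get that far, the keyword half of that script might look something like the rough Python sketch below. It doesn't scrape anything: it assumes the profile text has already been collected as plain strings. The sample profiles, the stopword list, and the function names are all made up for illustration, and "density" here simply means a word's count divided by the total number of words in the profile, which is my own working definition.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "i", "to", "of", "in",
             "my", "for", "on", "with", "is", "am"}


def keyword_density(text, stopwords=STOPWORDS):
    """Map each non-stopword to (its count / total words in the text)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    counts = Counter(w for w in words if w not in stopwords)
    return {word: n / len(words) for word, n in counts.items()}


def top_keywords(profiles, n=5):
    """Average each keyword's density across profiles; return the top n."""
    combined = Counter()
    for text in profiles:
        for word, density in keyword_density(text).items():
            combined[word] += density
    return [(word, total / len(profiles))
            for word, total in combined.most_common(n)]


# Made-up sample text standing in for collected profiles.
sample_profiles = [
    "I love hiking and cooking for friends.",
    "Hiking, travel, and cooking on weekends.",
]

for word, density in top_keywords(sample_profiles, n=3):
    print(f"{word:<12}{density:.3f}")
```

Averaging the density across profiles, rather than pooling all the text together, keeps one long-winded profile from dominating the results.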

Update: I shared the TED talk and post with Kelly Clay, a co-worker of mine in Seattle who has also had some frustration with online dating.  Her take on the same algorithm based on Seattle numbers provides a solid second look at the same process.

Doing this exercise was interesting on a number of levels because it meant thinking through what I actually want in a partner and how valuable each of my requirements is since every item on the list shrinks the pool of options. What would your algorithm look like?