Friday, April 28, 2017

Time warp

This is a pop psychology post by a rank amateur. If that bothers you, stop reading now.

Everybody has days that "feel" like they should be some other day of the week. Vacations and three-day weekends particularly mess with your inner settings. I don't know why a Wednesday feels different than a Tuesday, but it definitely does. This week, I feel like I've been off one day. In particular, I was sure Tuesday was a Wednesday and Wednesday sure felt like Thursday.

It occurred to me that this might have been the result of Monday's all-nighter. As I mentioned on Tuesday, it's been several years since I've truly worked through the night. Usually I get at least a few hours' sleep.

Knowing that I was probably going to be up all night, I took a bit of a break for dinner, spent some time with Kate, and even had a 20-minute nap. I sat down to work at 8PM and finished up around 5AM. Basically, a full work day. I then took another quick nap before heading in to work. I'm wondering if my brain didn't simply register that session as a full day and throw my internal schedule off.

Thursday, April 27, 2017

No comment

There are times when code comments are very helpful. I could spend quite a bit of time trying to understand the point of this method:

 public long nextLong() {
   this.l.lock();
   try {
      this.u = this.u * 2862933555777941757L + 7046029254386353087L;
      this.v ^= this.v >>> 17;
      this.v ^= this.v << 31;
      this.v ^= this.v >>> 8;
      this.w = 4294957665L * (this.w & 0xffffffffL) + (this.w >>> 32);
      long x = this.u ^ (this.u << 21);
      x ^= x >>> 35;
      x ^= x << 4;
      long ret = (x + this.v) ^ this.w;
      return ret;
   } finally {
      this.l.unlock();
   }
 }


But code comments are absolutely the wrong place to elucidate this. Instead, just preface the class with something like this:

//Mid-quality random numbers, better than java.util.Random,
// faster than java.security.SecureRandom
//Source: Press, Teukolsky, Vetterling, Flannery
// Numerical Recipes 3rd Edition: The Art of Scientific Computing


Now I instantly know the intent and, if I really want the details, I know what book to read. Similarly, sometimes it's obvious what a piece of code does, but not why. Such situations usually arise around edge cases that someone encountering the code for the first time wouldn't have thought of. A 1-line comment like:

 // if we get here, it means we weren't able to find a match

is a good thing.

That said, the academic infatuation with code documentation is just nutty. Every class must have a header? Really? You need me to explain what this 3-line interface file does? If you do, you really shouldn't be looking at my code at all.

package query;

public interface IQueryDistribution {
   QueryPattern getQueryPattern();
}


Even if you don't know Java, you should be able to figure out from the words "public interface" that this is some kind of interface. The fact that it's called IQueryDistribution is a tipoff that it has something to do with the distribution of queries. Similarly, the fact that the only exposed method is called "getQueryPattern" and it returns an object of type "QueryPattern" is a pretty big clue that any object conforming to this interface should produce query patterns.

Oh, but I didn't tell you all the places it's used and what classes implement it. You know why? I have no idea! The whole point of writing an interface is so you insulate your code from that kind of dependency. Any such comments would be obsolete the minute I turn it over to the rest of the development team.

I think a lot of academics don't realize just how quickly code is developed in industry. When I started programming in the late 70's, extensive header documentation actually made some sense. It probably took all day to write the 50-line function being documented, so spending another half hour filling in the boilerplate header was not terribly burdensome. Code was typically reviewed printed on greenbar rather than online, so you couldn't just right click on an object to get a list of all the places it was referenced. Nor could you instantly reference version control to find out who messed with this code last.

That's simply not the way things are done anymore. With today's development tools, even an old codger like me can knock out hundreds of lines on the rare day I spend actually writing code. Junior programmers who are doing it all day every day may produce over a thousand. Just about everything I used to get from comments can be produced by the development environment almost instantly. The ability to run the entire suite of unit tests in seconds on any code change means that I get immediate feedback if I've misunderstood something. For that matter, assuming the unit test cases really do exercise the object, they provide much better insight into how the methods should be called and the structures returned. If I want to invoke an object, I'll often cut and paste the unit test that matches my use case the best right into my code and modify from there. It's always better to start with a piece of code that works.
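
To make that concrete, here's roughly the kind of test I mean, written against the IQueryDistribution interface shown above. It's just a sketch: the implementation class is a throwaway stand-in invented so the example compiles, and I'm assuming QueryPattern has a no-arg constructor, which the real class may not.

package query;

import static org.junit.Assert.assertNotNull;

import org.junit.Test;

// Illustration only: a test that doubles as usage documentation.
// FixedQueryDistribution is a stand-in, not part of the actual project.
public class QueryDistributionUsageTest {

   @Test
   public void distributionProducesAPattern() {
      // assuming QueryPattern has a no-arg constructor (it may not)
      IQueryDistribution dist = new FixedQueryDistribution(new QueryPattern());
      QueryPattern pattern = dist.getQueryPattern();
      assertNotNull(pattern);
      // Reading this test tells a caller the whole contract:
      // build the distribution, ask for a pattern, use what comes back.
   }

   // Minimal stand-in implementation so the example is self-contained.
   private static final class FixedQueryDistribution implements IQueryDistribution {
      private final QueryPattern pattern;
      FixedQueryDistribution(QueryPattern pattern) { this.pattern = pattern; }
      @Override public QueryPattern getQueryPattern() { return pattern; }
   }
}

If I were actually coding against this interface, I'd copy that test body straight into my code and modify from there.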

Ah, well, it doesn't take that long to put a brief header on each file. So, since I've put way too much effort into this term project to get dinged on comments, I spent a few hours dutifully inserting explanations into my code even though they would already be clear to any competent programmer.

Don't get me wrong; I'm not opposed to documentation. Having some decent wiki-style entries on how to find, fix, and enhance the code is a great thing. The place for those is on the development team's shared web page. Having test cases written down in advance greatly reduces misunderstandings on requirements. Those should go in whatever test management tool you're using. And, of course, if a class is going to be published for general use by other groups, then one really ought to spend a little time writing up some sort of user guide telling what the class does and how to invoke it. None of these things choke your source files with a bunch of unneeded text that simply gets in the way of reading the actual code.

What is silly is the idea that taking code written in a non-ambiguous computer language and paraphrasing it in an ambiguous natural language somehow helps understand it. When the intent is truly unclear, by all means, let the reader know what you were thinking. But, if all you're doing is re-writing your code in English, you're wasting your time.

Wednesday, April 26, 2017

And then it was done

I gave my presentation in Evolutionary Algorithms yesterday. Thanks to the all-night-get-this-thing-to-work session, I actually had some decent results and the talk was well received.

My classes are back-to-back, and in the preceding period, my Set Theory prof said that the class as a whole has been doing well enough on the homework that he's not going to grade the final set of problems. Rather, he's just going to give some informal feedback.

So, in the space of two hours I went from being completely underwater to basically done on the school front. There's still the matter of the finals, but I'm not really very worried about those; I've been keeping up with the lectures pretty well and neither of these teachers are looking to nail people on the exams.

Tuesday, April 25, 2017

All-nighter

Yeah, haven't done one of those in a while. Anyway, my term project is done and it actually works really well.

Final presentation is here.

Sunday, April 23, 2017

Time's up

Looks like I'm going to have to present what I have on my term project Tuesday whether I'm ready or not. Kind of a bummer. I think even one more week would be enough to put something really good together. Maybe it doesn't matter. I will say that if this course wrecks my 4.0, I'm going to be pissed (though I have nobody but myself to blame).

Thursday, April 20, 2017

Going for the Trinity with evolution

Yes, the metaphor is getting strained, but I've run into a snag with the original "Purify, then Evangelize" strategy on my term project so I've added a third: Communion.

The problem was that if you try to do the purification during the splits, you end up doing it too often. It's a pretty computationally heavy operation and I can't wait all night for the next generation to be produced. So, I've decided that Purification will only happen when the progenitor blocks are selected. For each selected progenitor, one attribute will be selected from the weighted list of attributes and purified. By that, I mean I'll include enough values to get at least 75% of the block carried forward and the rest of the rows get moved to a new block.
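
Roughly, the purification step looks like this. It's a simplified sketch, not the actual project code: a row is just a map of attribute to value, and the 75% threshold is the one described above.

import java.util.*;

// Simplified sketch of Purification: tighten one attribute's criteria so that
// at least 75% of the block's rows still qualify; the rest spill to a new block.
class PurifySketch {

   static final double KEEP_FRACTION = 0.75;

   // rows: the block's contents; criteria: the block's attribute -> allowed values.
   // Returns the rows that no longer fit and must move to a new block.
   static List<Map<String, String>> purify(List<Map<String, String>> rows,
                                           Map<String, Set<String>> criteria,
                                           String attribute) {
      // Count how often each value of the chosen attribute appears in the block.
      Map<String, Integer> counts = new HashMap<>();
      for (Map<String, String> row : rows) {
         counts.merge(row.get(attribute), 1, Integer::sum);
      }
      // Keep the most common values until at least 75% of the rows are covered.
      List<String> byFrequency = new ArrayList<>(counts.keySet());
      byFrequency.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
      Set<String> kept = new HashSet<>();
      int covered = 0;
      for (String value : byFrequency) {
         kept.add(value);
         covered += counts.get(value);
         if (covered >= KEEP_FRACTION * rows.size()) break;
      }
      criteria.put(attribute, kept);   // the block's criteria just got more specific
      // Everything that doesn't match the tightened criteria spills out.
      List<Map<String, String>> spill = new ArrayList<>();
      for (Iterator<Map<String, String>> it = rows.iterator(); it.hasNext(); ) {
         Map<String, String> row = it.next();
         if (!kept.contains(row.get(attribute))) {
            spill.add(row);
            it.remove();
         }
      }
      return spill;
   }
}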

Adding in the rows from the non-progenitor blocks then continues as before, but splits will be simple: just create a new block with the same criteria and keep on inserting. This Evangelization step propagates criteria combinations that have been found to be good.

If that sounds like a recipe for a whole bunch of partially-filled blocks, you are absolutely right. So, the new step is Communion. Blocks with very few rows will be consolidated, matching criteria as closely as possible. Such blocks probably won't perform much better than the "Catch-All" block that has no criteria, but at least there won't be very many of them.
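
And the Communion step, in the same simplified terms. Again just a sketch; the row-count threshold and the overlap scoring are placeholders, not the real project values.

import java.util.*;

// Simplified sketch of Communion: blocks with very few rows get folded into
// whichever surviving block's criteria overlap theirs the most.
class CommunionSketch {

   static final int MIN_ROWS = 100;   // placeholder threshold for "very few rows"

   static class Block {
      Map<String, Set<String>> criteria = new HashMap<>();
      List<Map<String, String>> rows = new ArrayList<>();
   }

   static void consolidate(List<Block> blocks) {
      List<Block> small = new ArrayList<>();
      List<Block> keep = new ArrayList<>();
      for (Block b : blocks) {
         (b.rows.size() < MIN_ROWS ? small : keep).add(b);
      }
      for (Block s : small) {
         Block best = null;
         int bestOverlap = -1;
         for (Block k : keep) {   // match criteria as closely as possible
            int overlap = 0;
            for (Map.Entry<String, Set<String>> e : s.criteria.entrySet()) {
               Set<String> values = k.criteria.get(e.getKey());
               if (values != null && !Collections.disjoint(values, e.getValue())) {
                  overlap++;
               }
            }
            if (overlap > bestOverlap) { bestOverlap = overlap; best = k; }
         }
         if (best == null) {
            keep.add(s);          // nothing to merge into; keep the block as-is
         } else {
            best.rows.addAll(s.rows);
            // Loosen the receiver's criteria so the merged rows still conform:
            // keep only attributes both blocks constrained, and union their values.
            best.criteria.keySet().retainAll(s.criteria.keySet());
            for (Map.Entry<String, Set<String>> e : best.criteria.entrySet()) {
               e.getValue().addAll(s.criteria.get(e.getKey()));
            }
         }
      }
      blocks.retainAll(keep);     // the small blocks are gone; only survivors remain
   }
}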

Monday, April 17, 2017

Boston 2017

Like nearly all cities, Boston is host to many running races throughout the year. However, if you say you are "running Boston," it can only mean one thing: the Boston Marathon. Why I thought signing up for the 121st edition of the world's longest-running annual marathon was a good idea, during a semester when I was taking two classes, plus the Qualifying Exam, as well as delivering a major project at work, is beyond the scope of this report. Ill-advised as it might have been, I did, and actually managed to do a better than terrible job preparing for it.

I bring Yaya with me in the hopes that there might be some educational value in visiting such a historic city (note to parents of 13-year-olds: there is NO educational value in bringing a 13-year-old to a historic city). At least she's good company and she gets along well with the children of Lainie and Nick Ives, our hosts for the trip. The Ives are good friends of ours who live an hour north of the city in Newburyport. It's an easy train ride, so it's a far preferable option to sitting around an overpriced hotel room on marathon weekend.

I pick up my number on Saturday, with Olivia and Charlie (the Ives' 13-year-old) in tow. Since the marathon isn't until Monday, we wander around the city a fair bit on foot. Sunday is much less active, with just a short run prior to Easter Mass and some easy walking later in the day. After considerable debating of options we settle on the plan for race day: I will take the train into Boston, drop my bag, and then take the marathon bus out to the start. Lainie will bring the kids into the city later, viewing at Newton and the finish. We'll all go back together. As Nick is an accountant and it's April, he's pretty sure he'll miss the whole thing, but we leave open the possibility of him getting out of his downtown Boston office long enough to see some of the race.

This means a pretty early start for me but, since I usually get up between 4:30 and 5:00AM, not one I'm unaccustomed to. After my usual breakfast of coffee and oats, I collect my drop bag and jog over to the train station to catch the 5:20 into Boston. As we leave the station, I offer payment to the conductor who responds: "runners don't pay today." Well, at least one thing went my way.

The train gets to Boston's North Station a minute early at 6:24, which gives me 16 minutes to drop my bag and get on the buses. It's an easy jog of about a mile, though I had underestimated the climb. Unlike the good folks of Illinois who will name a road "Cliff Street" if it runs atop a 20-foot embankment, the settlers of Boston saved the name "Beacon Hill" for a hill large enough to make a good setting for a beacon. I take it easy, figuring that I can always take the wave 2 buses and still make the start.

That turns out to be the second win of the day because, unlike what's implied in the race bible, there is no real break between buses for each wave. They run pretty much continuously, so being five minutes late just means taking a bus five minutes later. I arrive in Hopkinton with an hour to kill before my wave is even called to the start. I spread out the garbage bag that I had intended to use as a disposable jacket and take a short nap on the grass of the High School.

At 9:10 we're called to the start. As that's only a kilometer away, I take my time, not wanting to stand around in the corral for too long. When I do get there, I find the organizers have cordoned off a nice area for warming up and I spend 15 minutes jogging easily before entering my designated corral. Thanks to the result in Milwaukee (only a month after Grad School started), I'm seeded fairly well in wave 1.

Well enough, in fact, that my main concern is not getting trampled. Fortunately, the first mile at Boston is fairly steep downhill and I'm a good downhill runner. So, even taking it easy, I get through the mile in just over 7 minutes and don't hinder those around me. Over the next three miles, I get passed by a few thousand folks, but I don't seem to be causing an obstruction.

Now off the steepest downhill part of the course, the reality of the weather is starting to sink in. It's hot. More importantly, the dew point is very high for this time of year, well above 60F. I wasn't expecting to run a great time anyway, so I'm not too upset about that, but I notice that many of the folks who were eager to get by me are now pulling back on the reins a bit.

The water stations, which come every mile except the first, are crazy as the field hasn't spread out even a little bit and everybody seems to want some. I decide that it's not that hot and I can go four miles between drinks (after all, I routinely do 16-mile split-tempo runs in St. Louis with no water at all). While that's true, at the station at 4, I make a point of getting a bottle rather than a cup of water. I don't really like carrying a bottle in my hand, but it's the best way to ensure I don't get too far behind on hydration.

At mile 7, I pass a guy in a Purdue University singlet. I tell him the last thing we need on a day like this is a Boilermaker.

I get to 8 in 58:45, just slightly on the slow end of my plan of running the first half in around 95 minutes and then adjusting from there. I try not to think about the fact that only a year ago, I would have been hitting 9 in another minute. What I do think about is that there's no way I'm holding even this pace for much longer. The heat hasn't gotten to me yet, but it surely will. The forecast calls for a cold front to arrive soon and I decide to bank on that. If it doesn't pan out, I'll just finish really slow. Won't be the first time that's happened.

And then it arrives.

The spectators probably don't notice much other than a few big gusts of wind. For the runners, it's transformative. Almost immediately, the moisture leaves the air. While the temperature holds in the mid 70's (well over 80 on the road), the combination of the breeze with the fact that sweat can now evaporate results in an immediate cooling of the skin. Within a mile, the cooling has spread to the core. Breathing is relaxed and the legs feel fresh and springy. I toss off my bottle (it's mostly empty anyway). I resist the temptation to drop the pace, knowing that this blessing could easily be squandered trying to claw back time on the relatively level ground through Wellesley when it really should be saved for the hills of Newton.

Speaking of Wellesley...

The women of Wellesley College are out in force as always, collectively audible from nearly a mile away. Each is brandishing a sign of why they should be kissed and, in an obvious but still interesting commentary on the tradition, quite a few of the signs indicate they would prefer a kiss from a female runner. I get a quick peck from one who indicates no preference and offer high-fives to several dozen more. It makes for a fun mile and, at 7:20, not a particularly slow one.

I go through the half at 1:36:47. A bit slower than hoped but, given the conditions, I'm quite fine with it. I hold the pace even for three more miles to Newton and then get to work.

Heartbreak Hill is the most famous obstacle in marathoning for good reason. However, it does not stand alone. If it weren't for the three hills that precede it, any decent marathoner could just suck it up for the three-minute ascent and get on with finishing the race. The Newton Hills make all the difference. None are particularly tough, but they come in quick succession at miles 16, 17.5, and 19. The cumulative effect is to move the critical distress point (which usually comes at mile 22) forward to mile 20, which just happens to be the location of Heartbreak Hill.

I decide I've got enough in the tank to risk hitting all of them pretty hard. The legs respond without complaint. The crowds are considerably larger than in 2012, consistently lining the road and 2-3 deep on the uphills. Between the noise and the effort I lose track of where I am. About halfway up Heartbreak, I suddenly panic that this might not be the last climb. I've already committed to it, so I keep the pressure on and am greatly relieved to see multiple signs at the top indicating that this was the big one.

While uphills can kill the engine, debilitating injuries like muscle pulls are much more likely going back down. Knowing I've gone deeper than I intended, I take it easy on the descent into Brookline. The clouds that accompanied the front have broken and the heat has returned. However, the dew point continues to fall, the legs are still OK, and there are less than five miles to go. There's really no excuse for not finishing this one strong.

While it's a bit warm for marathoning, the weather is ideal for marathon watching. The crowds lining the final flat miles into Boston are even thicker and louder than they were on the hills. It's great fun to be running well in such a situation. While my pace is only a few seconds per mile faster than it's been all race, there are enough runners wilting in the heat that I'm passing dozens every mile. I make the final turn onto Boylston Street and am greeted by a deafening roar for the last quarter mile. I hit the mats at 3:14:30, which is actually a course PR since my only other run here was the 2012 death march, compared to which today's conditions would rate as idyllic. My finish position is 3564, only five places beyond my 3559 bib number (no way that would have happened without the heat, so I guess I should be happy about the conditions).

As with everything else in this race, the finish is well organized and well staffed. I quickly collect my drop bag, change, and meet Lainie, Yaya, and Charlie. While I didn't see them on course, they saw me at both Newton and the finish. Perhaps more importantly, they got to Newton in time to see both the elite men and women come through. That was great for them. You just can't comprehend how fast they are running unless you see them come by in person (even then, you can't comprehend it, but at least you've seen what you didn't think was possible).

For me it was a huge step forward in my migration from "races" to "events." I've made incremental progress on that front, but this was the first time I was able to run an event and truly put competitive thoughts out of my mind. Given my current fitness, it was a decent run and I was able to embrace it as just that. The fact that I've gone considerably faster on other occasions and got beat by 162 other guys in my age group simply seemed irrelevant. I just enjoyed running. And running is something I truly enjoy.

Sunday, April 16, 2017

Heating up.

It must be me. Looks like we're going to have another warm one for Boston. The only other time I ran it (2012), it was the warmest in history. Anyway, since I'm not going to run fast anyway, I'm happy to have the excuse.

Also warming up is my term project, which still isn't done. This is getting serious. I still took today off because, well, it is Easter and I take this whole resurrection thing fairly seriously, but I will really need to hit it this week to wrap this thing up.

Wednesday, April 12, 2017

You can't be serious

"Take some time off." "Surely, you can't be serious." That's Kevin Lomax's response to his boss' suggestion that he take some time to tend to his life outside of work in The Devil's Advocate, one of my favorite books and films. He doesn't, and loses everything in the process.

I'm on my way to Boston. This trip can be charitably described as "ill conceived." "Bat-shit crazy" also comes to mind. Taking a week off in the middle of my toughest semester the day that my biggest project at work goes live. The most amazing thing is that absolutely nobody has told me to do otherwise. Maybe the stress fractures in my life are more obvious to others than I realize.

Anyway, I'm going, and Yaya is coming with me. We'll deal with the fallout when I return. Posting may be a bit sporadic in the meantime.

Monday, April 10, 2017

Easter evolution

Pardon the metaphor, but I do think this is the easiest way to think about how my evolutionary algorithm will work. I bring Easter into the mix because the evolution will be a balance of two strategies: Purification and Evangelization. (Most of the class is Hindu, so it remains to be seen how well this analogy works for them).

The basic strategy is to select some progenitor blocks and rank them by fitness (as measured by the last set of query runs). The remaining blocks need to have their rows allocated to progenitor blocks. The BlockAssigner does this. The algorithm for the BlockAssigner is as follows:

For each row, try the blocks in order of fitness; the first block whose criteria accept the row's attribute values gets the row.
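
In code, the assignment loop is about this simple. This is a sketch with stand-in types, where a row is just a map of attribute to value; it is not the actual project classes.

import java.util.*;

// Rough sketch of the BlockAssigner described above.
class BlockAssignerSketch {

   static class Block {
      Map<String, Set<String>> criteria = new HashMap<>(); // empty = catch-all
      List<Map<String, String>> rows = new ArrayList<>();

      boolean accepts(Map<String, String> row) {
         for (Map.Entry<String, Set<String>> c : criteria.entrySet()) {
            if (!c.getValue().contains(row.get(c.getKey()))) return false;
         }
         return true; // row satisfies every criterion (trivially true for a catch-all)
      }
   }

   // progenitors must be sorted best-fitness-first; a criteria-free
   // catch-all block at the end guarantees every row lands somewhere.
   static void assign(List<Map<String, String>> rows, List<Block> progenitors) {
      for (Map<String, String> row : rows) {
         for (Block block : progenitors) {
            if (block.accepts(row)) {
               block.rows.add(row);
               break;            // first (fittest) accepting block wins
            }
         }
      }
   }
}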

Easy stuff, but what if the block is getting too big? That's when we have a decision. We can either purify the block or evangelize it. Purification means finding a subset within the block that meets even more stringent criteria for inclusion. For example, suppose a block is entirely composed of records from business unit 17006. Suppose also that 80% of those records are also from the Product Line of Whole Life Insurance. We can purify the block by adding Whole Life to the criteria for the block. The remaining block will now be even more specific and more likely to be excluded from queries where none of the rows are relevant.

What about the rest of the rows? They get thrown back into the mix of rows to be re-assigned to other blocks. Obviously, you only want to take this strategy when the purification is meaningful. That is, the new attribute is high on the list of relevant attributes AND most of the block conforms to a small number of values for that attribute.

Suppose not. We've already got a good block on our hands (since it was selected as a progenitor), so the next best thing would be to spread that around. We look for a way to split the block such that we wind up with two relatively equal sized blocks based on either a new attribute or by sorting out values from an existing one (for example, if the block had rows from business units 17001, 17005, and 17006, we could put 17001 and 17005 in one and 17006 in another).

My gut feel is that it makes more sense to bias the algorithm towards purification. That is, if there's a reasonably good way to increase the criteria, do it. Evangelization will be a natural outgrowth of blocks that are already purified to the point where further refinement doesn't gain much. The idea is actually quite consistent with the Easter message. First, get your own act together, then spread the message around.

Finally, all this assumes the existence of a "catch-all" block for rows that don't meet the criteria for any other block. This block will also be split when it gets too big with a very heavy bias towards purification (leaving the remaining rows in a new catch-all block).

Sunday, April 9, 2017

SLOC S-F

Bagged an overall win today in the SLOC Orienteering meet at S-F Scout Ranch. That's not really setting the bar very high since the only other locals who can match my nav on technical terrain are Rick Armstrong (who's well over 60) and David Frei (who has ALS). Still it's always better to win than to lose. More importantly, I felt like I was navigating fairly well, which was a pleasant surprise given how little practice I've put in lately.

The boom going to 2 was simply inattention; it was easy to fix once I realized I'd gone too far. Five was a more serious problem. At least part of it can be blamed on the fact that the map is pretty old and the vegetation boundaries have moved a bit. Mostly, though, it was just a classic case of missing. What I like about that leg is that I didn't spend any time wandering. I immediately realized I'd missed and took the hit relocating off the trail. That's the difference between a 2-minute and 10-minute mistake. Hoping that you'll fix it quickly by looking around is not a sound strategy; if it was an easy fix, you wouldn't have missed it in the first place.

The boom at 13 looks way worse than it was. I intentionally aimed off to the right knowing the lake would catch me. When it did, the adjustment was trivial. Because I was running full speed the whole leg, I don't think I lost more than 30 seconds on that one.

I love S-F. I've always said it's our best map (though I certainly understand the case for Hawn). I was happy to run fairly well today. It was a very fine day in the woods.


Saturday, April 8, 2017

Yaya's solo

It's easy to forget she's in 7th grade. As you can see from the picture, she's full adult sized now and has been for quite some time. She also plays better than all but the best High School students. This isn't just proud papa talking; she's been placed into every honors ensemble that she's eligible for.

Anyway, she played an adjudicated solo today and, frankly, it wasn't the best she's ever played it. In fact, it was a pretty average practice run. That's not to say it was bad by any means; she practices pretty hard. But, it wasn't turned up a notch like most of her actual performances are.

She obviously perceived this as well and seemed a bit despondent as we left the room. I told her that there will always be times when you don't perform as well as you know you can, but if you've really done your work, even less than your best can carry the day. The important thing is to not give up while there's still life in the game. To her credit, she definitely did not give up. She fought to the end of the piece and the last bit was probably the best part.

That last paragraph is somewhat fictionalized because I actually only got about halfway through the above statement when her accompanist ran up and told her she had gotten the highest score. Apparently, the judges are better at remembering that she's a Middle School kid than I am. By Middle School standards, she crushed it.

I thought back to my first Middle School solo competition, also as a 7th grader. The piece wasn't nearly as hard as the one Yaya played today. As beginner trombone pieces tend to be, it was a bit raucous and not only allowed, but encouraged some sloppiness. I got full marks, but I don't kid myself for a minute that I actually played it perfectly. Yaya's piece was a legit performance piece where any flaw was immediately exposed.

What impressed me most was that she spent several hours later in the day practicing. Most kids (and adults) would take a max score in a County-wide competition as permission to goof off for at least the rest of the day.

I've always wondered how things would have gone if I had chosen music instead of cycling. I'm pretty sure I know now. I had technique and I understood music, but I never had the passionate drive that Yaya shares with people who really get good. I would have been toast. What lies ahead for her is anybody's guess but only a fool would bet against her.


Friday, April 7, 2017

A tough week, but a good one

Going into last week, I noted that I was not finished on any of my major items. I'm still not, but I've certainly made a lot of progress.

I turned in a reasonable stab at HW3 for Set Theory. It will need to be resubmitted after I get the comments back, but I wasn't embarrassed by what I turned in (though, I might be when I get the comments).

I've finished the query generator for Blocker and have a good direction for the evolutionary algorithm. I'm going to drop the Bayesian stuff for now and just use a heuristic. I give it 50/50 that I'll have something I can run by the time I leave for Boston.

At work we got through quite a bit of stuff prepping for production. Go-live is still tentatively scheduled for next Wednesday, though I'm not seeing how we get final signoff in time for that. That's kind of a bummer since it means I'll be in Boston when they turn it on. At any rate, now that we've got our Vertica instance running on a real cluster, we're getting billion-row queries back in 3-4 seconds, which is a big improvement over the old system. Load times are also faster by an order of magnitude. The old system loaded at just under 100 million rows per hour. The new one is around 1.2 billion and most of the time is extracting the data from Oracle; it will be even faster than that once the source system is writing directly to Hadoop. Given that the requirement is that we load 8 billion overnight, that increase is pretty crucial.

Thursday, April 6, 2017

Evolution

With the data layer and query generator done, it's time to get serious about the actual algorithm I'm going to employ for Blocker, my term project in Evolutionary Programming. Here are some thoughts.

The general flow is as follows: run a set of queries on the data and score the blocks. High scoring blocks get carried on without modification. Low scoring blocks are dissolved and their contents are copied back into the database using the block assigner. The evolutionary part is that the block assigner changes with each iteration as more information is gained about the distribution of the data and the queries.
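
The skeleton of one generation, with the interesting parts (query scoring and the block assigner itself) left abstract, looks something like this. All of the names are placeholders for illustration, not the real project code.

import java.util.*;
import java.util.function.*;

// Skeleton of one generation: score blocks from the last query runs, carry the
// high scorers forward, dissolve the low scorers, and let the (evolving) block
// assigner redistribute the freed rows.
class GenerationSketch {

   static <B, R> List<B> oneGeneration(List<B> blocks,
                                       Function<B, Double> score,
                                       double keepThreshold,
                                       Function<B, List<R>> dissolve,
                                       BiConsumer<List<R>, List<B>> blockAssigner) {
      List<B> survivors = new ArrayList<>();
      List<R> freedRows = new ArrayList<>();
      for (B block : blocks) {
         if (score.apply(block) >= keepThreshold) {
            survivors.add(block);                     // carried on without modification
         } else {
            freedRows.addAll(dissolve.apply(block));  // low scorers are dissolved
         }
      }
      blockAssigner.accept(freedRows, survivors);     // the assigner changes each iteration
      return survivors;
   }
}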

The novelty (it's not much of one, but the term paper doesn't require any original work, just an implementation) is in the way the block assigner evolves. In a normal Evolutionary Strategy setting, the evolution comes by perturbing the data randomly and then comparing the fitness of the new generation to the old. That's too inefficient for this application because reblocking the entire database takes quite some time.

Instead, I'll have a prior distribution on blocking preferences. From the query results, a posterior will be computed and that will be used to set the distribution for preferences in the next round. For this to really work well, it has to take into account the correlations between attributes. I'm not sure if I can go beyond pairwise correlations for this effort, but even that should help quite a bit.

A couple other quick thoughts on how I can get generations through faster:

  • Time box the operation. That is, keep generating and running queries until a fixed time has passed rather than running a fixed number of queries. This allows early generations, which tend to see big gains, to get done quickly. Later generations, where the improvements are much smaller, will get a larger query base because the query times will drop as a result of the optimization so far.
  • Change the sample threshold over time. Start with sparse sampling, maybe only actually reading 10% of the selected blocks. Increase the percentage of blocks actually read as the posterior distribution gets better and can benefit from more data (see the sketch below).
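
Here's what those two ideas look like together, as a sketch only: the time budget, the ramp rate, and runOneQuery() are all placeholders, not project code.

import java.util.concurrent.TimeUnit;

// Sketch: run queries until a wall-clock budget is spent rather than a fixed
// count, and ramp up the fraction of selected blocks actually read each generation.
class TimeBoxSketch {

   static int runTimeBoxed(long budgetMillis, Runnable runOneQuery) {
      long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(budgetMillis);
      int queriesRun = 0;
      while (System.nanoTime() < deadline) {
         runOneQuery.run();   // as queries get faster, more of them fit in the box
         queriesRun++;
      }
      return queriesRun;
   }

   public static void main(String[] args) {
      double sampleFraction = 0.10;                    // start by reading ~10% of blocks
      for (int generation = 0; generation < 5; generation++) {
         int n = runTimeBoxed(1000, () -> {
            // run one generated query here, reading only a fraction of its selected blocks
         });
         System.out.println("generation " + generation + ": " + n
               + " queries at sample fraction " + sampleFraction);
         sampleFraction = Math.min(1.0, sampleFraction + 0.2);   // read more as the posterior improves
      }
   }
}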

Wednesday, April 5, 2017

Sourdough

I spent 12 hours at work today, so writing about math is not really at the top of my agenda tonight. However, I promised my friend Bill that I'd give him instructions for the sourdough starter I shared with him last weekend so, why not, I'll post it here.

This starter was created in 1874 by Mrs. Sheldon Goodwin of El Dorado, Arkansas. It was then shared with Gladys Edwards who subsequently shared it with Nathaniel Pyron. Nathaniel shared it with me in 2016. The original recipe creates a fairly sweet version of sourdough. I've modified it to bring it more in line with a traditional sourdough, though the yeast does not produce the really heavy San Francisco sour. I always use weight when working with flour since volume can change significantly due to settling. However, 5oz per cup is a reasonable approximation if you prefer to measure by volume.

As with all yeast, rising times are very dependent on temperature. At 70F, expect the rising process to take 36 hours. It can be accelerated by warming the dough, but the longer rise time will yield a more robust flavor. My personal preference is to go the other way and let the yeast rise in the refrigerator, at least most of the way, to bring out the most flavor.

Preparing the starter

The starter is 1 pound, divided 50-50 between water and flour. Give the starter at least two days in the refrigerator after the last division. The starter will keep indefinitely at refrigerator temperatures provided it is fed a tablespoon of sugar every couple weeks.

Put the starter in a medium mixing bowl (opinions vary on whether to include any surface water; I always include it to keep the ratio right). Add 8 ounces water, 8 ounces bread flour, and 1 ounce sugar and stir until combined. Cover the bowl and return to the refrigerator for 24 hours. The starter should double in size during this period.

Divide the yeast, placing one pound back in the refrigerator for future generations (of dough, not people). The other pound is ready to be used.

Preparing the dough

You are looking for a 5:3 ratio of flour to liquid. However, keep in mind that all liquids are not created equal. I like using buttermilk, which has a large quantity of fat. Therefore, I have to increase the amount to get the ratio right. The following table should help:

flour | salt | oil | water* | buttermilk* | makes
7oz | .3oz | 1.5oz | 1oz | 1.5oz | small loaf or 14" pizza
12oz | .4oz | 2oz | 4oz | 5oz | small loaf and some rolls
17oz | .5oz | 2.5oz | 7oz | 8.5oz | small loaf and 14" pizza
22oz | .6oz | 3oz | 10oz | 11oz | two loaves

*In case it's not obvious, these are extreme points; only add one or the other. For liquids in between (milk, broth, beer, etc.), start with the water ratio and add as needed while kneading. The amount of oil depends on what I'm making. For pizza crust, I add slightly more. For bread, a bit less and, if I'm using something like beef broth as the liquid, I'll leave it out altogether.

Mix with wet hand or spoon until the dough comes together. Let rest for a few minutes. Knead with mixer or by hand until dough is not sticky (using a Kitchenaid with a dough hook, I find that 1 minute on low, 2 on medium, and 3 on high works well). Remove dough from bowl and fold several times on floured countertop. Cover dough with plastic wrap and return to refrigerator for 12-24 hours.

Baking

Dough should have roughly doubled in size. Fold dough a few more times to further develop the gluten and release most of the gas. Where you go from here depends on what you are making.

For bread, form into final shape and place in pan. Allow to rise (doubling in size again). This may take a while if you were doing the first rise in the refrigerator and the dough is still cold. I'll often do the forming right before bed and bake first thing in the morning. Bake at 375 for 10 minutes then reduce to 350 for another 30-35 minutes until internal temperature is at least 180F.

For pizza dough, form into dough ball and let rest for at least 30 minutes (up to 2 hours if it's coming out of the fridge). Flatten with palm of hand. Let rest for another 30 minutes. Roll or throw dough to desired thickness. Top and bake. For thin crust, 500-550F for 6-7 minutes. Thicker crust and heavy toppings require more time at lower temp.

For rolls, divide into 2-3oz rolls and roll into balls using palms of hands. Place either in muffin tins or together in a pan. Allow to rise at room temperature for another 2-3 hours. Bake at 375 for 20-25 minutes.

Epilogue

You'll be hard pressed to find a baker who hasn't killed their yeast at some point. The defense against this is to share it with your friends so you can get it back. If you want some, let me know. I'll happily split a batch for you.

Tuesday, April 4, 2017

Set Theory HW3

Rather pressed on this one; I'm sure I'll resubmit it. First effort is here.

Sunday, April 2, 2017

HW2 corrections

Set Theory HW2 had enough small errors (and a couple big ones) that I re-submitted it. The updated proofs are here.

Saturday, April 1, 2017

Reims to Nimes

If you came here hoping for a race writeup of the Firehouse 5K, let me not waste any time in disappointing you: 19:53, 6th overall, 1st over 50. Predicts a 3:10 at Boston which absolutely won't happen because I have no intention of running Boston full-on. Enough about that.

Instead, I'm going to write about how I celebrated passing the Q. I have a few friends who can comprehend both what the Q represents and also appreciate a good celebratory meal, but none more so than Bill & Laura Langton. Therefore, we had them over for dinner tonight. Kate and Laura are somewhat indifferent to French wine but Bill & I are quite passionate about it. The theme of the night was Reims to Nimes. In my mind, that corridor represents the absolute pinnacle of culinary excellence.

I tried to come up with a menu that followed geography north to south or vice versa, but found that a bit unwieldy. So, I settled for just keeping the wines from somewhere between Champagne and Provence and then paired the courses as best I could. Here's the final menu:

Appetizers: 2014 Domaine Servin Chablis Les Pargues with Crab dip and chips. While pairing seafood with a chardonnay is obvious enough, I probably wouldn't have thought of this simply because of the standard maxim: what grows together goes together. As Chablis is a land-locked appellation, crab wasn't even on my radar. However, the more I read about it, the more I kept seeing that Chablis and crab are made for each other. It's true, the match was perfect.

Salad: 2015 Miraval Cotes de Provence blush with Spring Greens, Goat Cheese, Cranberries, Pecans, and Olive Oil. Another revelation. This course was added when I found out (at the last minute) that Yaya's evening plans were cancelled so we were going to have both her and her friend at dinner. I originally had the salad as part of the next course (baking it directly on the flatbread). I asked about how one could best bring a Provence Blush into a salad course and Kevin from the Wine and Cheese Place suggested not using dressing at all, but rather feature goat cheese with just a bit of olive oil. Total home run. This pairing was sublime.

Intermediate: 2009 Moulin a Vent Jules Desjourneys with pesto, bacon, and onion flatbread. As much as Bill's CdP was fab, this was the wine of the night for me. Maybe it's just because I've never had a Beaujolais of this quality. I expected it to be good because it came with a good recommendation from a reliable source (a master sommelier who knows my preferences) but, damn, it was really, really, really good. And, the flatbread I made to go with it was one of my better efforts as well. My usual sourdough crust seems to like more savory sauces like pesto (for a normal red-sauce pizza, I use standard active dry yeast). The bacon did a great job of bringing out the deeper notes of the wine and the onions, well, what's not to like about caramelized onions?

Main: 1998 Bosquet des Papes Châteauneuf du Pape with slow cooked sirloin and peppers over cabbage. Yeah, this was pretty much the bomb. Bill supplied the wine. The stew was pretty straightforward. The one twist that worked great was that, rather than throwing the bell peppers in the crock pot with the meat, I roasted them separately and then added them as a garnish. I think they retained a lot more flavor that way. The other thing was searing the meat in the fat left over from the bacon.

Dessert: Non Vintage Bollinger Champagne with bread pudding. Ok, bread pudding isn't even French, much less from the Rhone corridor, but Kate likes it and Bill makes it really well. So, since he insisted on contributing a course, that's what we went with. And it was super good. I generally buy grower Champagnes as they tend to be better deals. Kate gave me this bottle for Christmas and, while I'm sure I could have found a comparable grower wine for less, this was certainly ample evidence that the big houses do know what they are doing. Meanwhile, Bill's bread pudding is quite good and his sauces (yes, plural, he brought three) are crazy good.

Yeah, it was a good evening for sure.