Wednesday, December 05, 2012

Thank You, Gordon Moore

In October of 2007, I wrote about hardware assisted brute force attacks.  Namely, that it was an error to assume today's computing power was what we'd have access to in the future.  For example, if you know some computing operation is going to take ten years, one option is to wait five years, at which point (assuming Moore's law) your operation will only take 2.5 years.  You save 2.5 years.

Well, today I stumbled across this:
In a test, the [GPU cluster] was able to churn through 348 billion NTLM password hashes per second. That renders even the most secure password vulnerable to compute-intensive brute force and wordlist (or dictionary) attacks...this means we could rip through any 8 character password (95^8 combinations) in 5.5 hours.
The point is: if you wait five years, the hardware you get is exponentially faster than what was available at the time.  In 2018, are we going to be down to minutes?  And what about combining this approach with rainbow tables?

Sunday, November 18, 2012

Rewriting directshow?

Directshow is a pretty tenacious interface, having been used for approaching two decades at this point.  The official party line from Redmond is "use MediaFoundation," but it's never seemed particularly useful given that MediaFoundation is absent in older versions of Windows.

I've always (perhaps foolishly) thought it'd be fun to write my own media framework.  So I spent some time figuring out what I would hope to improve on over Directshow.  I came up with the following list of things:

  1. I dislike most things about IMediaSample
    1. I dislike the way it communicates a media type--through AM_MEDIA_TYPE.  AM_MEDIA_TYPE is weird to work with, easy to use incorrectly, and littered with buckets of legacy junk that nobody really cares about anymore.  It is easy to leak memory with it.  It is difficult to understand.
    2. I dislike that media samples themselves don't have a method for communicating their media type.  You might counter with IMediaSample::GetMediaType(), but that method provides the media type only if it differs from the previous media type.
  2. Filters consist of separate "pin" and "filter" interfaces despite being one object; this distinction has always been nonsensical to me.  A pin is really the types of input or output a filter contains, as well as a means of "connecting" to the filter.  The filter is...presumably everything else.  The separation between the objects makes writing filters messy, and communicating with them odd.  And I don't see a clear reason for the distinction.  It also creates coupling problems (for example, see CSource and CSourceStream in the DShow base classes, particularly how a pin adds to a filter and deals with reference count)
  3. Support for B-frames is marginal at best.  DirectShow has only a timestamp--not separate timestamps for presentation and decode time--which makes handling certain types of media difficult.
  4. I dislike how duration is represented--100 NS units.  A better way to represent duration is to give individual media streams a timescale--this avoids nearly all rounding error and most floating point math, except at endpoints and render time.
    1. When you think about it, specifying the duration of units is really the same as specifying a timescale, except less flexible.  A duration based on 100 NS units is the same thing as having a timescale of ten million (there are ten million 100 NS units in one second).
    2. Example: we're processing 8 khz audio.  The best timescale to use here is 8000.  When buffers come along, the "duration" of the buffer is simply the number of samples contained within.  Let's say incoming buffers contain 1024 samples in it; our duration is set to 1024 as well.  Total sample duration is 1024/8000 seconds.
    3. Example: we're processing H.264.  Many applications use a timescale of 90000, although accuracy up to DShow's 100 NS units can be achieved by setting the timescale to ten million.
    4. This last example is worth mentioning--compatibility with other media frameworks--like DShow--is easy.  Set timescale to ten million, voila: no translation necessary.
  5. I dislike DirectShow's sample pool concept.  This was probably a great design choice in the early 90s, when processing media was a pretty serious task, but these's irrelevant.
  6. Automatic graph building is a cool idea, but its never worked well for me in production code because random filters may be inserted into the graph.  It's neat for quick-and-dirty apps that do very little.  It's pretty much pointless for a serious production application.
  7. Surprisingly, many of the filters I'd want built-in are often absent.  For example, there really isn't a comprehensive filter for dealing with colorspace conversions, resampling audio, etc.  Many of the "bread and butter" filters you often desperately want aren't available.
  8. Live video streaming support is horribly complicated to implement, and it's questionable if most down-stream filters are even following suit with the API.  Getting AV sync to work well with a live stream literally gives me fits.  
  9. In general, a very complicated API.  Likely because at the time it was dealing with some very complicated subjects.
On the other hand, there are things I really like about DirectShow:
  1. It can be extended to handle nearly any audio or video type, metadata types, etc.  It is extremely flexible.
  2. Because it is so modular, often a major change to your application only involves replacing a filter or two here and there, so changes end up being quite tidy.  For example, I have a playback application using MP4 files containing H.264 and AAC.  Someday, if the business requirement changes, I only have to swap out one filter and make sure it implements IMediaSeeking....and everything should literally work as it previously did.
  3. IFileSourceFilter is a great idea.  I don't like the naming, but having a standard interface to configure source filters makes it much easier for client writers to swap out filters.
  4. IMediaSeeking is a great interface for advanced seek operations.  It's great to have this work well at a high level without having to be overly concerned with the guts of the underlying filter graph.
  5. Some of the utility filters, like the infinite tee, null renderer, sample grabber, etc. are always super handy.
  6. The event-driven framework--while overly complicated--is really useful given that processing audio and video (particularly in real-time) is a very "event" oriented process.
In the coming days, I'm going to take a shot at implementing a very basic media framework in C#.  People have said managed code isn't suitable for media processing; I plan to test that hypothesis, because I see no real evidence showing otherwise.

I'll update my progress here as I go.

Monday, October 29, 2012

State Machines Made Simple

I recently stumbled across a bug in some code where our embedded device would sometimes fail to take the correct behavior on disconnecting from our back-end servers.   This was actually the third fairly major bug in this particular code. When I dumped the code to a state machine using gliffy, it became clear we had a problem with state.  It looked something like:

Okay, wasn't that bad.  But it was very, very far from good.  Here's a (vastly) simplified overview of the flow of the code:

bool someState;
bool anotherState;
bool yetAnotherState;
while( someState )
    // some initial work...

while( !anotherState )
    while( ... )
        // some initial work again...
    // Wait here until something happens 

    // ruh roh, a disconnect, loop around...
    if( yetAnotherState)
        // special blah here...

...full version of the "driver" method was ~200 lines of code, not counting methods that loosely corresponded with the state transitions.  State was largely tracked with a motely assortment of boolean values.  The nested while loops were difficult to understand, as was the initial loop that seemingly duplicated the first inner loop.  Transitions between state were littered between the driver method and the various state methods, in part because there really was no clear home for them.

If you can clearly communicate your process with a state machine, it can be translated using a pretty typical design pattern that A) makes it very easy to understand, B) very easy to modify in the future and C) very reliable.  For instance, consider this basic (and yes, slightly silly) state machine--I was doing laundry this evening, so this was what immediately came to mind:

I obviously make no claims that this method of doing laundry is correct (I will claim my wife has drilled me into adding bleach to the water before putting the clothes in, for what it's worth).  But that's really not the point.  The point is this can translate quickly, cleanly and reliably to working code.  Start by defining all your states:

enum WashingState

Then, in code, do the following:

while( not done )
    switch( current state )
    case Start:
    case AddDetergent:
        current state = have detergent ? StartWash : OhNoesGoToStore;
    case StartWash:
        // start the wash
        current state = AddBleach;
    case AddBleach:
        current state = have bleach ? FillWithClothes : OhNoesGoToStore;
    case FillWithClothes:
        // Fill with clothes...
        current state = WaitUntilComplete;
    case WaitUntilComplete:
        // wait until done
        current state = Start;
    case OhNoesGoToStore:
        // return....nothing left to do.

In a nutshell: your enum values correspond with states.  The switch statement handles transitioning between states, and the while loop causes the switch statement to repeat continuously, until told not to.

In some code it may be more appropriate not to have a long-running process, in which case having single methods that allow atomic transition of state is probably a better way to go--but even then, the pattern really remains the same--individual methods correspond with your states, and those methods allow atomic transitions from one state to the next.

But what I enjoy about the "while( not done ) { switch( state ) {} }" pattern--particularly when it's being applied to something like a long-running thread--is it's neat, well contained, easy to understand, and easy to extend and/or modify in the future.

A few passing comments before I close this out:
  • Ignore articles like this one that claim "there is no way to enforce state transition rules" in switch-based state machines.  This is clearly a claim desperately in search of a weird over-engineered solution; it is trivial to enforce state transition rules.
  • Avoid allowing outside forces to influence state, particularly if this is a dedicated thread.  If you must, the best option is to perform this transition immediately following the switch statement, before looping around the while, to "trump" whatever choice made by the case label.  Not pretty, but neither is life sometimes.
  • Keep It Simple Stupid. There are all sorts of fancy methods out there--hierarchical state machines, crazy inheritance, template wizardry (there is some relationship between Dr. Dobbs and a total inability to embrace simplicity), etc.  It is super awesome you can make C++ jump through hoops, but please, spend thirty seconds thinking about the last time somebody handed you a plate of completely incomprehensible code.
  • Consistency matters.  Not for you, probably, but for the next person to maintain your code.  Try to be consistent about where state transitions happen, locking (if necessary), and structure of the code.

Tuesday, February 21, 2012

Thunderstone, Or An Abbreviated Guide to Comp Climbing Treachery

Lately in my spare time I've been playing Thunderstone with my wife and an acquaintance.  It's a fantasy card game where players basically build up a deck of cards during the game in an attempt to vanquish monsters in a dungeon.  There's a lot more to it, of course, but the strategy and creativity of the game is really wonderful.

Anyway, our acquaintance is an avid gamer who's clearly spent an inordinate amount of time scheming about strategy, tactics and technique for winning at these games; he is a cunning adversary, and that's what makes it fun.  So it should have come to no surprise when he was dumbfounded at the complete lack of strategy some climbers demonstrated at a local bouldering competition.

The comp was fairly typical.  It has a qualifier phase where climbers could pick and choose problems to try from fifty to seventy five options of varying difficulty.  However, only the top three problems would be considered for a climber's final score during qualifiers.  Based on this final score, three climbers would proceed to finals.  The finals format was also fairly common: climbers were shown three problems, and given five minutes to attempt each.  Between attempts they were given five minutes of rest.

My friend inquired about these rules, and I explained them, as well as some of the tactics I generally employ (note these don't hold true for all comp formats).  In no specific order, during qualifiers:
  1. Warmup.  This gives me a chance to get ready for harder problems.
  2. The goal of qualifiers is to do the minimal amount of work to get to finals.  Burning yourself out in qualifiers is a sure-fire way to fail on finals problems, which are typically more difficult.
  3. Never get on a hard problem before someone else.  Best case scenario: your competitor is unable to finish the problem, in which case you've earned free beta ("beta" is info about how to do a climb) and you possibly don't even need to attempt it anyway.  Worst case scenario: they send and you still get free beta, hopefully enabling you to send it quicker and easier than them.
  4. Religiously track who's climbed what.  Even if your competition does do some problem more difficult than you, who gives a shit?  You need to get to finals, not thug it out with some rock-jock during qualifiers over a fake prize.
You'd think these rules would be obvious, but my friend seemed dumbfounded how the third rule could ever be feasible; he aptly noted that if everybody waited for someone else to get on a hard problem, then nobody would ever attempt anything difficult.  And this is true, assuming everyone there has considered strategy.

But the reality is they haven't.  Most comp climbers spend roughly five seconds considering the best approach to maximize success before throwing themselves at some stupid-hard problem.  For example, at this comp, I waited for my competition to solve a rather cryptic problem for me.  My two finals competitors did eventually climb the problem--but it took them a handful of costly attempts, whereas I was able to recycle their efforts unlocking this problem for a single-attempt ascent.  I believe this is a win not only from an efficiency standpoint (one try is way better than multiple tries), but for your competition it can be sobering from a psychological standpoint.

Even worse, both of them spent time and effort on problems that were irrelevant to getting into finals.  Had either of them taken a moment to survey the field, they would have realized they were virtually guaranteed a finals seat since the only other climber doing problems in their range was, well, me.  Instead, they both attempted a problem neither of them ultimately sent.  But had either of them actually succeeded at this problem, it wouldn't have changed the fact that they were still headed to finals, which is all that really matters during qualifiers.

In finals, the strategy is different.  You cannot leverage someone's beta in this format since you don't get to watch them climb.  But there's always strategy:
  1. With five minutes of time, realistically, you'll get three to four good attempts.  
  2. Your first attempt will also (likely) be your strongest.  Thirty seconds between attempts on a very hard climb simply isn't enough time to recover.  The first attempt is often worth the most since you get bonus points for climbing something on your first try (i.e. flashing).  Not in all comps, but most.
  3. This means you should have a damn good plan of how you're going to do stuff before you pull on the wall.  I generally take at least a minute surveying the problem before I pull on.
  4. Know when to cut your losses.  This can be hard to gauge, and also slightly heartbreaking to give up.  But if you honestly don't see yourself sending a problem, save energy for the next one.
Again, neither of my competitors seemed to realize any of this.  They spent seconds glancing at the problem before attempting, hurting their chances at first-attempt success.  They made too many attempts.  They did not save energy for future problems.

Combine their errors in the qualifiers with a lack of strategy in finals, and you end up with a commanding win by me.  Which, of course, I like.  But I have to wonder: did age and treachery simply beat youth and talent?  Is this really a legitimate part of "climbing," especially when many climbers would attribute this sort of strategic whoring (on plastic, no less) to pure foolishness?

It's not clear.  But I have to admit I really love strategy.  I love finding that chink in the armor, that little advantage that gives me some edge, my ace in the hole.  It's what I love about Thunderstone.  It's what I love about real stone.