Tuesday, March 16, 2010

Finally Some Usable Storage

I've tried a boatload of different synchronization tools and finally found one that actually works as advertised and meets my needs: Dropbox. Since I'm a very picky consumer (read: perfectionist), it is rare that I say a product is very good.

So what makes this product unique? It works simply, works quickly, and it is free for normal consumption. Considering that my definition of normal is anything but the norm, that's a big statement.

Basically, you install the Dropbox client on your machine (Windows, Mac, Linux, what-have-you) and set the location for your Dropbox files. Anything that shows up there is automatically updated on the server. By using multiple clients with the same account you get real-time-enough (that's a technical term) synchronization across multiple machines.

I use Dropbox for all my OneNote notebooks. And my work documents. And some source code. And some pictures. Pretty much anything I want to have A) backed-up automatically and B) available anywhere.

Your free Dropbox account is limited to 2GB of data, and even for a storage hog like me this is plenty. And did I mention that the synchronization is fast? That's because it looks inside files and synchronizes only the parts that have changed.
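For the curious, the general block-hashing idea looks something like this. To be clear, this is a sketch of the technique, not Dropbox's actual protocol; the block size and function names are mine:

    import hashlib

    BLOCK_SIZE = 4 * 1024 * 1024  # illustrative block size, not Dropbox's real value

    def block_hashes(path):
        """Hash a file in fixed-size blocks so unchanged blocks can be recognized."""
        hashes = []
        with open(path, "rb") as f:
            while True:
                block = f.read(BLOCK_SIZE)
                if not block:
                    break
                hashes.append(hashlib.sha256(block).hexdigest())
        return hashes

    def blocks_to_upload(old_hashes, new_hashes):
        """Only blocks whose hashes changed (or are brand new) cross the wire."""
        return [i for i, h in enumerate(new_hashes)
                if i >= len(old_hashes) or old_hashes[i] != h]

Edit a file in place and only the affected blocks get re-uploaded, which is why even my big OneNote notebooks seem to sync almost instantly.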

I use a notebook, a netbook, a Mac, and a desktop PC throughout every day for work. As I was typing meeting notes into OneNote on the netbook, I noticed it synchronizing in the background. Every few minutes the alert on my desktop popped up that updates were downloaded. Talk about peace of mind. When I'm done for the day, I walk out with my netbook knowing all my machines have the same information on them. I can edit any file on any of the machines and they all just stay in sync about as fast as I can switch from one to the other. When I write a new article on the plane, my laptop syncs as soon as I get home. In the time it takes my Mac to boot up, my files are already being downloaded. I can get right to work knowing I'm using the latest versions of the files, hassle-free and worry-free.

For a while I was using the Gladinet software, but it was unreliable and too slow. The GoodSync tools are really slow too, and they won't run on my desktop PC because I run Windows Server (boo!). With both of these, as with most of the sync solutions I've tried, the synchronization algorithm doesn't look inside the files, so I could confuse the client with multiple updates and then lose data. Inevitably that's exactly what happened, and it pissed me off. And of course both of these cost money too.

Among other things, I keep a Box.net account, a SkyDrive, a private FTP server, and heaps of both Amazon and Google Storage. Most of this is for work and collaboration, and I use them for encrypted back-ups too. But none of these make moving my data around easier than Dropbox. If Box.net (my previous favorite) or any of the big boys want to know how to do synchronization right, they should check these guys out.

And no, I'm not paid for this, and I'm not affiliated in any way with Dropbox et al. I just rarely get the chance to toot the horn of something truly great with such mass appeal.

Tuesday, March 02, 2010

The Gorilla in the Clown Suit

In my line of work, you deal with all sorts of different "leaders" and "managers". Typically, they've achieved their position through some confluence of skill, opportunity, and luck. They all have their positive and negative attributes, depending on the needs of the situation and your vantage point.

The one I find most fascinating though is the Gorilla.

The Gorilla type of leader is epitomized by being unruly and unpredictable, beating their chest when things are rough, blundering around leaving banana peels that trip up other people, and applying brute force to solve any problems they can't avoid by putting their hands over their eyes. Oh, they seem intelligent enough at first, but don't forget that their typical response to even minor irritations is to start flinging feces.

When trying to identify a Gorilla, watch for them to put their hands over their eyes. You will notice this by listening to how they handle ambiguous requirements. You will often hear them using phrases like "I interpret this to mean..." and "I think what was intended...". They will insist on asking everyone to estimate their completion dates, but are totally uninterested in keeping track of the actual work required to hit those dates.

Because they don't actually manage risks (or a traceable work plan), their project will inevitably have unexplained delays. They handle this with chest-beating in more meetings about the dates and the issues. In these meetings they studiously avoid detailing the actual work involved or assigning any accountability. Unless, of course, they think they can pin the blame on some small monkey. At that point, they jump right on top of the unsuspecting ape and demand a full accounting.

While this chest-beating occurs, they routinely change the process and expectations. This creates wonderful opportunities for others to fail to meet the new expectations, or to realign to the new process quickly enough. These little banana peels are perfect for tripping up the unsuspecting, and the resulting slip-and-falls usually provide several candidates for assigning future blame.

With all this mayhem occurring, the delays generally get worse, which means they now need to apply their brute gorilla strength in an effort to get things back on track. Since they haven't really been paying attention, and there are banana peels everywhere, this is harder than it sounds. So of course, now they get frustrated and voila! the feces flinging commences.

If you find yourself working with (or heaven help you, FOR) a Gorilla, keep these simple rules in mind and perhaps you'll do okay.

First, don't look them in the eye. Calling out that the endless meetings are a bad idea and unproductive is like poking the gorilla with a stick; it will only get you labeled as "uncooperative" or "not a team player". Instead, bring your laptop and get your own work done if you have to be there in person. If possible, make it a conference call so you can put yourself on mute and get something done while the gorilla beats his chest.

Second, watch the ground for banana peels. If the rules are constantly changing, give yourself plenty of time for the non-value-add work of aligning and re-aligning. This means getting your monkeys up in the trees, away from the gorilla, so they can do the real work. Keep them isolated so that if (or more likely when) you stumble on a peel, you're the only one who gets hurt and your team stays productive.

Lastly, find and maintain a poncho. Inevitably with a Gorilla there will be flung feces and finger-pointing. Make sure you've cataloged and widely publicized your risks and issues while maintaining transparency with your schedule, dependencies, and work plan. When things do turn nasty, keep your cool and trust in your poncho. If you've kept your poncho in good repair (being transparent and publicizing early and often), you should be prepared for the worst of it.

In closing, it's worth pointing out that Gorillas aren't all bad. They are especially useful in death marches and suicide missions. When what is being undertaken is completely unreasonable, or success will likely cost someone their career, a Gorilla is a great candidate. When the situation is already beyond repair and you are just trying to salvage something, chest-beating and brute force are great strategies for maximum gain. If you lead multiple teams, unleashing a Gorilla to work for you is a quick way to identify which teams have their feces together and which people on which teams are worth saving. Good resources will escalate around a gorilla, and good peers will already have their ponchos firmly in place, rendering the gorilla ineffective.

Thursday, February 04, 2010

Mostly Magic

Writing code can be highly scientific. Designing software can be highly artistic. Defining an architecture that encompasses elegant designs, is easy to test, is efficient to construct, and still manages to satisfy constantly changing business goals is some science, some art, but mostly magic.

Let's face it, every code monkey likes the way they write their code. Conversely, they think the code someone else writes is mostly garbage. In the same way, all designers think their particular designs are elegant and demonstrate "best practices" while simultaneously denouncing other designs as grotesque or impractical. These two elements, which are combined when defining the architecture of a solution, are both highly subjective and inflammatory. So how does one allow individuals to contribute according to their skills and from their own perspectives without turning the architecture into a Frankenstein-like monstrosity?

The first technique is generally applied only at the design and code levels and is called Separation of Concerns. It is generally credited to Dijkstra, who introduced it in a paper in the '70s.

The general idea is that the various aspects of a single effort should have their boundaries defined and respected in such a way that each can receive focus individually. For example, separating how a screen displays information from the format in which the data is stored. Or agreeing on the bolt pattern for a wheel so that wheels can be manufactured by different suppliers and still fit many cars. By creating a boundary between two areas of concern, each side can evolve independently as long as it honors the boundary.
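As a minimal sketch of what such a boundary can look like in code (the names here are invented for illustration): the display side depends only on an abstract contract, so the storage side can change its format without the screen ever knowing.

    from abc import ABC, abstractmethod

    class NoteStore(ABC):
        """The boundary: what the display needs, with no storage details."""
        @abstractmethod
        def load(self, note_id: str) -> str: ...

    class FileNoteStore(NoteStore):
        """One side of the boundary: the storage format is a private concern."""
        def load(self, note_id: str) -> str:
            with open(f"{note_id}.txt") as f:
                return f.read()

    class NoteScreen:
        """The other side: rendering knows the contract, not the file layout."""
        def __init__(self, store: NoteStore) -> None:
            self.store = store

        def show(self, note_id: str) -> None:
            print(self.store.load(note_id))

Swap FileNoteStore for a database-backed implementation and NoteScreen doesn't change a line; that's the bolt-pattern agreement at work.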

While this principle is often touted, and used as a tool to flog the unsuspecting about the need for services, loosely coupled interfaces, interface contracts, and so on, it can just as easily be misapplied. Consider how many components in an automobile require access to the electrical system. In software we often see the equivalent in secondary types of functionality such as providing access control, auditing activities, or styling a screen. When you have these types of cross-cutting concerns, they tend to flout the rules by which we create primary concerns, which leads to the aforementioned architectural monstrosity.

In reality, separation of concerns is about more than just design elements; it needs to be applied at every level, from how the work is structured, to how the business processes interact, to how systems utilize shared resources. So working magic into your architecture involves applying this concept to more than just your designs; it will be present in your mindset and in how you attack and decompose complex solutions into elements to be designed. Treating the cross-cutting concerns as first-class concerns is definitely part of practicing this magic.
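As a sketch of what treating a cross-cutting concern as first-class might look like (again, illustrative names only): define auditing once, as its own concern, and apply it at the boundaries instead of smearing audit calls through every primary concern.

    import functools

    def audited(fn):
        """Auditing defined once, as its own concern, applied at a boundary."""
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            print(f"AUDIT: {fn.__name__} called")
            return fn(*args, **kwargs)
        return wrapper

    @audited
    def approve_order(order_id: str) -> None:
        """A primary concern that stays free of auditing plumbing."""
        print(f"Order {order_id} approved")

The primary concern stays clean, and the audit policy can change in one place.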

The second technique doesn't really have a formal name that I've been able to uncover. It is essentially the realization that there is going to be a difference between what is asked for and what is needed. Generally speaking, we all think that providing good software is about giving users what they want. In reality, it's about giving them what they need in an efficient way and helping them feel like it's what they wanted.

Ask any user what they want and they immediately turn into children. We all know that every kid wants a pony. Girls want it to be white and gentle. Boys want it to be dark with a white streak and a wild side that makes it run really fast. And there are a few crazy kids who go all out and decide they want a unicorn. They are users; they have no idea what goes into the lifecycle of acquiring, maintaining, riding, and eventually putting a bullet in the head of said pony. They just want all the things they associate with having a pony!

If you focus on giving them the pony, you are going to end up with really sad pony-owners. They'll have paid a premium for a beautiful pony that can't carry a rider and keeps growing until it's a huge horse that costs too much to keep. And who knows, they might have been totally happy with something cheap, durable, and practical: a donkey.

The magic isn't just in realizing that a donkey can meet the need. It's in selling the donkey to someone who thinks they want a pony, by focusing on what they get instead of what they think they want. Sounds hard, doesn't it?

I told you it was mostly magic. And as an architect, it's what I practice every day.

Thursday, January 14, 2010

Why Is The Build, The Build?

Someone asked me recently about why I structure builds the way I do.

For background, I should explain that I have a reference SQL build that I've been using for many years. It uses command files (.cmd) to iterate over a directory structure and create (or re-create) a database. It has no dependencies, works with integrated or SQL Server security, logs outputs, etc. Pretty much the usual.

In this reference build, the structure is broken down so that tables are separate from keys, defaults, constraints, indexes, and so on. Procedures are broken down by type, and views are separated similarly. The configuration scripts for users, data, etc. are all broken down as well. There are dynamic drop scripts that tear everything down before attempting to build everything back up.
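To make that concrete, the layout is broken down something like this (the folder names and ordering here are illustrative, not the literal reference build):

    Database\
      000-Drop\          (dynamic drop scripts)
      100-Tables\
      200-Keys\
      300-Defaults\
      400-Constraints\
      500-Indexes\
      600-Views\
      700-Procedures\
      800-Users\
      900-Data\

Each element lives in its own file within its folder, so every change lands in exactly one script.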

Four specific concerns were raised about why it's structured the way it is, and it took some time to go through the rationale. I figured I would capture the thinking for you in this post.

The first was about using granular scripts. After all, most tools, and many shops, just merge artifacts into either one big script or lumps of connected objects, for example merging keys and constraints into tables. The second was about the use of the tear-down scripts: if all the scripts are designed to create objects, not perform alters, why tear everything down en masse at the start? The third was about how we handle errors; specifically, the build only outputs the errors from each script and doesn't actually stop. Lastly, the placement of indexes within the build, before the data load, was questioned, since typical data loading wants indexes removed so the load happens quickly, with the indexes restored afterward while the server is online. These are all great questions.

To address these concerns we need to set the expectations for the build process and how it fits into a larger team effort. The core expectation is that these scripts will be used by an automated process deploying a database repeatedly into multiple environments, as well as by individual engineers who want to deploy locally or into alternative environments as needed. It is also expected that the scripts will be managed by a larger body of engineers, as opposed to a single engineer or team managing all the scripts.

We can certainly discuss why having a single definition of the build, or being able to leverage the larger team model for script development, is advantageous or relevant at this level, but for this discussion these expectations are foundational.

Given these expectations, it becomes important to surface, as early as possible, as many of the errors in a build as possible. Essentially, since there are potentially many people making many changes, all present in a single build, we want to uncover all the problems in a single pass, because the different issues have a good chance of being unrelated. For example, changing an index or a table default won't affect whether a procedure will build, so we want to catch both kinds of error in one pass.

This also translates into creating the indexes before loading the data. If there is going to be a problem with the data because of an index, you want to find it where the data is managed, not where the index is managed.

Further, by using granular files, individuals change independent elements like keys, constraints, and defaults, and the success or failure of that change is tied to a particular file, which can be tied to a particular engineer and a particular change. In short, we want to know about all the errors as early as possible, but we also want to be able to pinpoint the source and responsible party for every error as efficiently as possible.
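If it helps to see the shape of it, here is the gist of the runner sketched in Python (the real build is a .cmd file driving sqlcmd; the paths here are illustrative): run every script in order, log any failures, and keep going so a single pass surfaces every independent error.

    import subprocess
    from pathlib import Path

    errors = []
    for script in sorted(Path("Database").rglob("*.sql")):
        # -E uses integrated security, -b makes sqlcmd return nonzero when a
        # script fails, and -i names the input file.
        result = subprocess.run(
            ["sqlcmd", "-E", "-b", "-i", str(script)],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            errors.append((script, result.stdout + result.stderr))

    # Report everything at the end: all the errors from one pass, each one
    # tied to a single file, and therefore to a single engineer and change.
    for script, output in errors:
        print(f"FAILED: {script}\n{output}")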

So a quick review:
  1. Use Granular Scripts - so we can detect all errors independently and pinpoint the source and responsible party as quickly as possible.
  2. Use Tear-Down Helpers - so engineers working locally or in alternative environments can restore to a known point as quickly as possible, with as little effort as possible.
  3. Don't Stop On Errors - so we can find all errors independently with a single run of the build.
  4. Create Indexes Before Loading Data - so we can find errors in the data load and pinpoint the source and responsible party as quickly as possible.

Hopefully it is clear how the specific patterns support the expectations.