Tuesday, September 22, 2015

The (slightly tongue in cheek) role of the database administrator

As a former DBA, I find a disturbing trend toward a value proposition that is almost nonexistent among a recent crop of database administrators. Maybe having some background and/or working with other stellar DBAs in the past has spoiled me, but here's the workflow I've found more and more common.

Scenario - production application has slowed down for a few transaction types, Dynatrace shows a critical sql statement has slowed down. None of the development team has access to run explains, we can't "afford" hardware to load the production dataset into another environment (because we're using a DBMS that costs 13.6 bajillion dollars per CPU nanosecond with an additional upcharge of 1 million pounds sterling every time we execute a query that uses DML)... Explains indicate everything is optimal in the lower environments. The decision to use this particular platform was made after the salesman for the product took the DBA team to Vegas for a "conference" and after tough negotiations (forever documented by the excellent 'based on a real life story' movie: "The Hangover").

Step 1 (optional) DBA team notices the query is slow also...email from DBA team to application team "Dear application team, while eating donuts and sipping single malt whiskey this morning, I accidentally hit a button on this funny looking device sitting on my desk and the monitor in front of me popped up a report indicating this sql is pretty slow, thought I'd let you know. Please fix ASAP as I believe we're wearing out our disk spindles and they're VERY expensive. Also, do you know how to close the window with this report...it overlaid the Rugby World Cup streaming live in HD and I really want to see the 'All Blacks' win!"

Step 2 (optional step one) email from application team to DBA team "Dear DBA team, we've (also) noticed this query is very slow. In every other environment it runs fine (less than 5 milliseconds), and it uses the primary key on every table, so we're unclear why it takes 10 minutes to complete in production. Can you investigate?"

Step 3 email from DBA team to application team "Dear Application Team - as I mentioned before, this query is slow and my children's college fund is contingent on our database using less than 20 IOPS under peak load, fix this IMMEDIATELY! Upon investigation I think if you removed that query the system would run much better...I will note that your application is causing our database to use a lot more cpu and IO than when we initially powered the system up. Obviously you don't know how to write software as prior to your application coming online the 3 'hello world' applications we used to test the database platform didn't cause any problems like this. Just because you collect data from sensors around the globe and provide real time data to thousands of users simultaneously doesn't mean you can just slap crappy SQL in and expect it to run well. Please let us know if you need any assistance writing application software as we're clearly much more intelligent than you."

Step 4 followup email from DBA team to application team "Dear Application Team - we further noticed you're heavily using the database during our backup window from 1am to 8am eastern. It's critical the system not be used during this time window as we have a team of interns backing up the entire database on 3.25" floppies. Please tell your users not to use the system during this window or remove these SQL statements ASAP. Also, do you know a good torrent client? I really need to catch up on the last season of 'Game of Thrones'. PS did you see the latest news on CNN, evidently a lot of users of your system are complaining about performance during peak online shopping hours in Europe...Hope you figure out what you did wrong when writing your application software."

Step 5 application team changes system to use BDB running on an old android device connected to the internet via a 2g wifi hotspot....sets up elaborate VPN solution to enable application servers to use this database from production data center...problem goes away

Step 6 entire DBA team is promoted for "saving the company millions of dollars by optimizing key SQL queries"

Evidently, this is the new normal. Apologies to any DBAs who might, in fact, have been more helpful or proactive in helping to solve the problem.

Tuesday, September 15, 2015

How to design a useful javascript framework

Based on my highly scientific analysis, there are currently 13.98 javascript frameworks per javascript developer. I've personally been on two projects where a framework was built, completely scrapped, and rewritten BEFORE THE PROJECT WAS DELIVERED! Based on this observation and the literal wasteland of half baked frameworks available, I'm sharing some insight on how to design a useful framework. Follow these rules and you'll have a higher probability of having something useful, ignore them, and well...I guess that's your choice, I won't hate you for it (OK, maybe I will a little).

Step One: Pick an existing framework

No, this is not a tongue-in-cheek joke, this is reality. Unless you initially demonstrated your framework at Bar Camp back in 2006, you should start with something that already exists and first deliver your project with that.

Step Two: Find things that the chosen framework doesn't do well

Now look at your product that is "code complete" and do an analysis of where your most common bugs happen, where new developers trip up and make mistakes, or where the code is repetitive. If nothing stands out...STOP, you're done. If there are rough edges, analyze approaches to the rough edges, and see how other EXISTING frameworks solve the problem. If nobody's solved it or if you think you've found a better way, refactor your code and enhance your chosen existing framework.

Step Three: After doing this for 4 to 5 years, and you've found better patterns or something novel, write a framework

Note, this step seems to be the one everyone skips (even authors of currently popular frameworks). Experience is important, attempting to write a framework after doing a "TODO" list app because you discovered something you don't like is a recipe for disaster. Moreover cross posting your new framework announcement across the internet to "make a name for yourself" is irritating and counterproductive.

Step Four: Write some useful applications using your framework

If you're starting a new javascript framework and haven't USED it, your chances of building something that is generally better than what already exists are vanishingly small, and you're likely expending energy on something that will not advance the state of the art by anything remarkable and is much more likely to be a step backward. I don't mean to discourage creativity, and still encourage folks to experiment and try things out...but temper your enthusiasm with reality until you can clearly illustrate how your new framework is better. In addition, be sure to look at your baby objectively...warts and all.

Thursday, September 10, 2015

Why I'm pushing your buttons

Interesting Story (to me)

I found myself in another interesting "ex post facto" software design review session and both sides of the table were getting increasingly frustrated. My frustration centered around my engineer's inability to explain "why" he did it that way or "how" it worked, and his frustration was a mystery to me. I suspect this has to do with the perception that perhaps I was telling him his baby was ugly.

I think what leads to this situation is the realization that I'm an almost negligent delegator. Yes, most who interact with me or know me professionally might not think it's true (as I DO like to get my hands dirty). I tend to give my tech leads and developers lots of rope which unfortunately means it will fit quite easily around all of our necks. This is, however, deliberate because I feel this historically has yielded the most innovative results and "generally" produces really good or really bad software. It averages out (IMHO) to "Better than average" software, and when taken with the notion that I then end up with a pool of REALLY good engineers that I can throw at fixing the "really difficult" stuff...I feel I usually end up with "above average" solutions (for long term engagements).

That having been said, there is a particular problem that this approach produces. That is, when someone takes an innovative or creative approach that isn't well thought through, we get to a situation where design guidance wasn't given early enough. Moreover, things that I could help forestall early on are then "too late to fix". In general I'm OK with this, yes it's frustrating to everyone involved, but frankly I have to be honest and say that it's deliberate. It's my approach to developing software engineers by allowing them to make and realize mistakes that has proven to work from a professional development as well as overall software quality perspective.

If you're ever in some sort of design review and I'm asking stupid or (better/worse yet) super challenging questions and questioning your every tiny detail... trust that I'm doing it not because I don't believe in your solution or don't believe your solution is "good enough" or don't believe you did a "good job", but that I want us all to do even better next time. I'm not being a dick or trying to be "Mr. Smarty Pants", I'm simply trying to help us both get better. When I say "I don't understand how this works" or "I don't think that's a good idea" I'm not saying you did a bad job, but truly just want a better understanding. I WILL say that if you've been doing this less than 20 or 30 years I may have some experience with problems you might not yet have run across and hope you will give me the benefit of having some reasonable doubt in your own capacity and experience. But I will also reserve judgement on your approach or its quality until I have as complete an understanding as I can in the time available to me to review it.

In short, never assume I'm challenging your software because I think you (or your code) are inferior or not well thought out, but think of my challenges as an opportunity to challenge your own preconceptions and as an opportunity to grow yourself. At the end of the day, I hope to become smarter...but can only do that by accepting the notion that I might be wrong...you should do the same.

Tuesday, September 8, 2015

Real world Internet of Things

Having spent some time working with devices connected via GSM, I'd like to share some observations that seem to be obvious to cellular network engineers but get lost in the breathlessly overhyped echo chamber of marketing. The short assessment is that depending on the mobile nature of your connected device, the allowable delays, and the amount of data you intend to transmit and receive, you will need to very carefully choose your protocol and state management.

To begin, there are two major types of connected "things": #1 Mobile "things", such as trains, planes, and automobiles, and #2 Immobile "things", such as thermostats, refrigerators, and buildings.

Mobile Things

Mobile things have two unique problems you'll need to be concerned with, they are: #1 Speed, #2 Network availability

Speed (and/or velocity) impacts network connectivity because it introduces signaling problems for the radio network. Rapid movement or changes in direction can greatly increase packet loss and greatly reduce the effectiveness of protocols like TCP. If your device moves quickly and/or changes direction/speed often, you'll likely want a UDP or RTP based protocol to allow customization of how you deal with packet loss and delay. This also means that you're going to want to reduce the size of each message, but potentially increase the frequency and develop novel ways to handle losses. If you design your device to just use a TCP connection and delegate all this work to the networking stack (without serious TCP tuning) you're going to have a lot of problems most network engineers (outside wireless telecom) are just NOT used to dealing with. Be prepared for sleepless nights and sporadic failures if you use TCP.
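To make the "small messages, handle your own losses" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the 16-byte wire format, the field names, and the `LossTracker` class are hypothetical, not something from a real deployment. The point is only that each datagram carries a sequence number, so the receiver can tell what went missing and decide for itself whether it cares.

```python
import socket
import struct

# Hypothetical wire format (an assumption, not a standard):
# sequence number (uint32), timestamp in ms (uint64), reading (float32),
# packed in network byte order with no padding = 16 bytes per message.
MESSAGE = struct.Struct("!IQf")


def encode(seq, timestamp_ms, reading):
    """Pack one reading into a small fixed-size datagram payload."""
    return MESSAGE.pack(seq, timestamp_ms, reading)


def decode(datagram):
    """Unpack a payload back into (seq, timestamp_ms, reading)."""
    return MESSAGE.unpack(datagram)


def send_reading(sock, address, seq, timestamp_ms, reading):
    """Fire-and-forget send over a UDP socket: no retries, no blocking on acks."""
    sock.sendto(encode(seq, timestamp_ms, reading), address)


class LossTracker:
    """Receiver-side bookkeeping of sequence numbers that never showed up."""

    def __init__(self):
        self.highest_seen = -1
        self.missing = set()

    def observe(self, seq):
        if seq > self.highest_seen:
            # Everything skipped between the old high-water mark and this
            # datagram is missing, at least for now.
            self.missing.update(range(self.highest_seen + 1, seq))
            self.highest_seen = seq
        else:
            # A late (reordered) or duplicate datagram arrived.
            self.missing.discard(seq)
```

On the device side you would create the socket with `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` and call `send_reading` on whatever schedule suits you. What the receiver does with `missing` (ignore it, interpolate, ask for a resend) is exactly the per-application decision that TCP would otherwise make on your behalf.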

Network availability introduces a similar problem as your device may enter and leave areas where it cannot communicate "at all". As above, your protocol needs to have definition around "what do we do when my device falls off the network for a few minutes/hours/days?". This is a big deal when using TCP based protocols because most connection based protocols account for this by waiting around to see if the connection can be reestablished, then using retry mechanisms to "guarantee" delivery.
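One way to answer the "what do we do when the device falls off the network?" question is a bounded store-and-forward queue on the device. The sketch below is a hypothetical illustration in Python: the `Outbox` name, the drop-the-oldest policy, and the `send` callback returning `True` on success are all assumptions. Your application may well prefer to drop the newest data, or to thin out old samples instead.

```python
from collections import deque


class Outbox:
    """Bounded on-device buffer for messages that could not be sent yet."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry when full.
        self.pending = deque(maxlen=capacity)
        self.dropped = 0

    def queue(self, message):
        if len(self.pending) == self.pending.maxlen:
            # Count what we are about to lose so it can be reported later.
            self.dropped += 1
        self.pending.append(message)

    def flush(self, send):
        """Send queued messages in order; stop at the first failure.

        `send` is a caller-supplied function returning True on success.
        Returns the number of messages actually sent.
        """
        sent = 0
        while self.pending:
            if not send(self.pending[0]):
                break  # still offline, keep the rest for next time
            self.pending.popleft()
            sent += 1
        return sent
```

The device calls `queue` for every message and `flush` whenever it thinks it has a link. The capacity is the explicit statement of how many minutes, hours, or days of data you are willing to hold, which is the decision a blocking connection would otherwise hide from you.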

A major failure folks tend to have with mobile things is that they tend to test their devices in unrealistic controlled environments (stationary in a lab) and rate their performance based on these criteria. When the devices begin operating in the real world, the factors above rear their ugly heads and the reliability of the overall system is severely impacted.

Stationary Things

Stationary things are arguably easier to deal with since, once connected, they don't have to deal with the problems mentioned above for mobile things. More importantly, they are generally tested in a manner similar to how they are deployed...that is, stationary and with network connectivity. There is, however, a problem stationary things share with mobile things that is generally much more aggravated for them...namely:

"What do I do if I cannot get on the network?"

When designing a stationary thing, how can you handle a device that is deployed in a location with poor or nonexistent connectivity? Generally, a stationary device will need more connectivity options (3g, 4g, wifi, X.25, satellite, ethernet, bluetooth) as once someone has placed the device, it is unlikely that it will roam in and out of cellular connectivity (if it can't get a connection, it will never get a connection).
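As a rough sketch of what "more connectivity options" can mean in software, the Python below walks an ordered list of links and picks the first one that works. The link names and the `probe` callables are hypothetical placeholders; a real device would put its own modem, wifi, or ethernet checks behind them.

```python
def first_working_link(links):
    """Return the name of the first usable link, or None if none work.

    `links` is an ordered list of (name, probe) pairs, most preferred first.
    Each probe is a caller-supplied function that returns True when the
    link is usable and may raise OSError when the interface is absent.
    """
    for name, probe in links:
        try:
            if probe():
                return name
        except OSError:
            # Treat a failing interface the same as an unusable one.
            continue
    return None
```

A stationary device would run something like this at install time and then periodically afterwards, since the link that fails on day one is likely to keep failing for as long as the device stays where it is.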

As more devices become connected, designing network protocols and tuning the stack to the device's attributes will matter more and more in everyday life.