Friday, March 12, 2021

social network censorship

OK, I want to break something down at this point. Social networks, web applications, newspapers, and other media outlets are not "the government".

Why do I make that statement? Because I keep seeing people crying "wolf" about how facebook is violating their free speech rights. This is 100% untrue (right now) and opens up a thorny debate that has been around since people were dialing up bulletin boards on 1200 baud modems in the 1980s. Here's the root of the problem/question:

if somebody posts illegal content, who can be sued/go to jail for it?

In the olden days (before section 230) it was (for the most part) "anybody and everybody involved". This means if somebody uploaded... IDK, kiddie porn, copyrighted materials (music, books), or a legit conspiracy to overthrow the government (the examples everyone seems to use)... then the creator of the content, the place it was uploaded to, the phone company, anyone who downloaded it, and you name it could be sued/jailed.

Thus, back in the olden days they created a law that gave intermediaries who are simply "platform providers" some legal protection from liability in the case illegal content was put on their platform. This, however, solves only HALF the problem... sure, myspace can't necessarily be sued for hosting illegal music uploads, but now the music industry isn't protected from folks pirating their product. So the "other half" protects folks who are effectively being ripped off (think pirated music/movies) by giving platform providers the right to moderate uploaded materials (by taking down illegal content) without the risk of the person uploading suing them for "getting rid of their content".

The conundrum around the current situation, however, is... "who gets to decide how content should be moderated?". Right now that's in the hands of the platform provider (facebook, google, whomever), and the problem is, if they decide a bunch of Antifa or Proud Boys posts are in violation of their own terms, they have the right to remove the content, ban the user, or... really do anything they want (including nothing).

So the problem becomes thorny... at this point facebook could take down every "pro biden" post (well, there are logistical problems, but that's a different issue) and, other than the poster fuming about it (unless they were banned), nobody would know. The upside is that there are market forces at work, because facebook makes money from advertising "pro MAGA" materials to the proud boys and "BLM" material to BLM supporters... so they need to keep some of that material to pay the bills. (How can you shill MAGA hats and #BLM t-shirts if you block all their posts?)

At the end of the day, I think there is an emerging awareness that "the system" as we know it around digital content and social platforms has some pretty serious flaws, and I suspect in the next few years they will start to be regulated a little more closely. I don't think section 230 will necessarily be rolled back, but there will definitely need to be some adjustments in order to maintain a free and open internet while also holding companies that profit from divisive and objectionable content posted by third parties accountable for fostering a potentially toxic environment.

Tuesday, January 5, 2021

There was an old woman who swallowed a fly [anti-pattern]

On many occasions I find myself humming this tune:

I'm sure there's another name for this anti-pattern, but it's a variation of yak shaving in which each progressive solution to a perceived problem becomes more imperfect and workarounds and unlikely solutions are progressively applied. Ultimately, the question often forgotten is "what was the original problem?"

I'll give the technical version of what happens:

  1. Someone discovers that logging in to a web application isn't working for some users
  2. it's discovered the service that accepts userid and password is returning an error for what we believe to be a valid userid and password for these users
  3. digging further it's discovered that the code to validate userid and password is calling another service to load a user's customer record and this service is returning a new error
  4. when researching this customer service, it's discovered a new error started 2 weeks ago that seems to impact a small percentage of the user base
  5. when researching these particular users, it's discovered they were all created within the same 2 day window (years ago)
  6. when researching what happened at that time, it's discovered that there was a deploy that happened before and after the failing records were created
  7. Upon further investigation, it's discovered the first deployment introduced code that caused a different problem and the second deploy backed out a set of changes
  8. It turns out that backout, in addition to the problem that was already known, caused the new problem
and so on and so on...

The reality is that this can progress in such a way that each new discovery requires some sort of change to diagnose and troubleshoot, which ultimately causes new unknown problems that, when discovered, will have no clear path back to why the change was introduced in the first place and no way to know if it should be reversed. Worse yet, the original problem that you set out to solve is lost and often even forgotten.

The anti-pattern part of this is essentially the "dark side" of following the Boy Scout Rule, which is to always leave things in a better state than when you arrived. The trouble is that it can be difficult to ignore little problems, and often there are so many "little problems" that they defocus your effort and attention from the "original problem".

In short, it is important to remember the task at hand and to stop and think "is this more important/necessary than what I was originally trying to do?", coupled with keeping track of "what did I set out to do, and does this activity get me closer to my objective or not?".

Thursday, September 17, 2020

What the heck are source code, environments, and versioning, for non-technical people

Had an interesting conversation today and thought I'd share some insight for business people dealing with technical folks.

The crux of the question centered around a request we were making for "read only access to a staging environment". I pondered why that line item was there because from a technical perspective what we needed was access to the source code for a web application, we really didn't need access to anything in the environment (though it wouldn't hurt). Moreover, with only access to the environment, we wouldn't necessarily have access to the source code so the actual need wouldn't even be met.

In conversation, a light began to come on in my head: I realized that "source code" and "environment" are almost meaningless to many non-technical people, and in today's "As A Service" world, sometimes they get mingled together.

Brief Overview

Source code is essentially the instructions that tell a computer "what to do". So, for example: if date > today display "date must be today or in the past" is source code. What happens on many platforms is that those text instructions get translated into a string of "0's and 1's" that tell the computer how to do this. So the instruction above might be translated to: 00010101010010101010010101001010000000101010011011110101010101011100000101010101010101001010101010010101001010010100 and when you feed that string of 0's and 1's to the computer, it will do what the programmer intended the textual instructions "should" do. This string of 0's and 1's is colloquially called the "binary" by tech weenies.
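Purely as an illustration (nothing here comes from a real system, and the function name is made up), here's roughly what that one-line instruction might look like as actual source code in Python:

```python
from datetime import date

def check_entered_date(entered: date):
    """Return an error message if the entered date is in the future."""
    if entered > date.today():
        return "date must be today or in the past"
    return None
```

The text file holding those lines is the source code; the translated form the computer actually executes is the binary.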

Now for some wrinkles... this translation is known as "compilation" or... in some languages, "interpretation". So languages like C, Fortran, or Go are "compiled" and languages like Ruby, Javascript, or Python are "interpreted". And (as a side note) languages like Java are actually a hybrid of both. There are literally hundreds if not thousands of programming languages and all of them use some degree of compilation or interpretation (even if they are a visual language). But the important detail is that the "source code" isn't necessarily the same "file" as the "binary". A simple way to think of it: if you take a screen shot of a powerpoint presentation, you can hand that screen shot to someone else (or post it online, or whatever) and continue to edit the presentation itself, and you can even store the source code (the presentation) and the binary (the screen shot) in different locations.

So for example, using our powerpoint example, you could be creating a presentation, take a screen shot and post a "work in progress" on the web, while continuing to edit the powerpoint presentation (the source code). Moreover, you may save "versions" of the source code (and/or binary) so that you can explore different fonts/layouts/colors, and display/edit them differently in different "environments"...maybe one environment is an internal web site with a bunch of unbranded pictures, but there's another public site with the final product.

So what?

So putting it all together using our powerpoint example.
  • Source Code = Our PPT file that we edit to create screen shots to display elsewhere
  • Binary = Our screen shot of the presentation at a particular point in time
  • Version = A particular revision of either the PPT or the Screen Shot

OK, got it, but again, so what?

Because anything beyond a trivial Hello World program will have multiple versions, with perhaps different variations that change over time. So when building things like ecommerce sites (or almost any non-trivial app), folks need the ability to test and validate new features before turning the feature on for the world to see. In our powerpoint example, there might be a QA step or workflow to validate that the resulting image from the powerpoint uses the right fonts/colors/branding before displaying it to the public. Because of this need, most modern platforms have the notion of different environments for different purposes. Some common examples are:
  • "Local" - the developer local machine, they can only see the code and the images
  • "Development" - the environment where all the developers can see each others work put together. sometimes there can many "development" environments, depending on how complicated the solution is, but the point is, it's a remote environment or version that isn't exclusive to a single developer
  • "Staging/Integration/QA" - these are other environments used for a variety of purposes sometimes they don't exist, sometimes there are dozens of these.
  • "Production" - this is where the world gets the final product
The process of moving the binaries between these environments is generally known as "deployment" and the workflows around deployment are myriad, but the point is that once you've created a version, you move that version between environments.

OOOOOhhhh Kaaaay, I think I got it, so what's your point?

So, the confusion arises because of something I mentioned earlier about "interpretation" and "compilation". In our powerpoint example, an interpreted language creates the screen shot automatically in the environment when someone tries to view the content. In a compiled language, the picture is created ahead of time and only the picture moves between environments (it might be tied to a version of the original PPT, but this is only a loose association).

So for example, suppose I have a PPT called "Mikes_presentation.ppt" and while building it I create 3 versions: "Mikes_presentation_v1.ppt", "Mikes_presentation_v2.ppt", and "Mike_presentation_v3.ppt". I now have 3 versions of the source code, and for the purposes of this discussion I store them on my local machine... so I have 3 files. Furthermore, let's say I want to take a screen shot of each of these and send them to someone to take a look (maybe they don't have ppt), and I put them out in three different places... two of them are "for internal use only" and the last one is "for the world to see"... so I might put one at "preview.mikemainguy.org", "earlyacess.mikemainguy.org", and "www.mikemainguy.org"... let's just pretend those are web sites or "environments". At any given point, if I point someone to those "environments" they might see a screen shot of any of the versions of the source code, because I've deployed different versions to the environments.
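To make the "deploy a version to an environment" idea concrete, here's a toy sketch in Python. The deploy helper and the screen-shot file names are made up for the example; only the site names come from the paragraph above, and none of this is a real deployment tool:

```python
# Toy model: each "environment" points at whichever binary (screen shot)
# was most recently deployed to it.
environments = {
    "preview.mikemainguy.org": None,
    "earlyacess.mikemainguy.org": None,
    "www.mikemainguy.org": None,
}

def deploy(binary: str, environment: str) -> None:
    """'Deploying' is just pointing an environment at a particular version."""
    environments[environment] = binary

deploy("Mikes_presentation_v1.png", "preview.mikemainguy.org")
deploy("Mike_presentation_v3.png", "www.mikemainguy.org")
```

The point of the sketch: the environments only ever hold the pictures, not the PPT files that produced them.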

However, for "compiled" versions, I may (and routinely would) only send the screenshot to the environment, because the environment itself doesn't need to know anything about the original PPT, it just needs to be a picture. So if I wanted someone to edit that PPT, or enhance it, access to the file in the environment won't be useful because I can't change the original PPT used to generate the picture.

For "interpreted versions" all someone needs is acess to the environment, and if you don't have additional controls, they might edit the PPT that I called "Mike_presentation_v3.ppt" to have completely different content than the one sitting on my hard drive.

OK, is that good or bad?

It's honestly neither, but it does illustrate (I hope...I know it's been a bit of a ramble) that access to the "environment" doesn't necessarily give you access to the "source code"...and an "environment" might not actually reflect what your "source code" (or the copy with the same version) can generate.

So honestly, what's the big deal?

Well, it gets confusing because some tools (the "interpreted" examples) inherently store the source code in an "environment". This means "environment equals source code", but other tools (the "compiled" examples) don't necessarily equate the two. It's further complicated by the fact that this is a trivial overview and reality is MUCH more complicated (some "compiled" code is also stored in the environment, and "environment" also includes things like operating systems, device drivers, networking... so the source code doesn't necessarily give you everything you need to reproduce it).

At the end of the day, hopefully I gave a (not so) brief primer to the semantics behind "source code", "environments", and "versioning"... and my full apologies to all the folks who will be coming out of the woodwork to explain the million different ways this is technically not 100% correct... to them, I just say "the business people don't care, we just need a better way to explain the concepts."


Friday, July 10, 2020

Amazon busted?

Trying to find a battery on amazon and no matter what I try, I get this (logout, login, incognito):

Thursday, December 5, 2019

MQTT/AMQP design implications

If you're working with embedded devices or telematics solutions, you might be hearing noise about a protocol called MQTT. It's a relative newcomer to mainstream use: invented back in 1999, it was only published royalty-free in 2010. Roughly speaking, it is to TCP-based binary wire protocols what SPDY is to HTTP.

At the heart of the protocol, and the reason you might use it instead of, say, AMQP, is its simplicity. There are only five basic operations a client must implement, its wire format is minimal, and because of that simplicity it is theoretically able to use less power.

A close examination of the differences between AMQP and MQTT shows that low-power or low-memory devices (think Arduino class) will certainly find it easier to speak MQTT than AMQP. As an example of how an ideal architecture leveraging the strengths of each protocol might look, take a look at the following diagram:

When looking at this stack, let's talk about the implications of this approach over a SPDY/HTTP implementation from the device perspective.

For devices living in a low-power, lossy environment (on the right), using MQTT makes a lot of sense. If you periodically transmit 10 bytes and need to know whether a device is connected or not... as well as maintain a small footprint for the libraries doing the connection management... MQTT wins hands down versus AMQP or HTTP. On the other hand, once these messages are delivered to an MQTT broker, it becomes more important to handle message queuing, reliability, and a host of other things that an embedded device typically won't have the power or inclination to manage. Additionally, in a low-memory/low-power situation, maintaining application-level message transaction state for the life of the operation is often rife with error.
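For a sense of how little code the device side needs, here's a minimal publisher sketch in Python using the paho-mqtt client library. The broker hostname, client id, topic, and payload are all made up for the example, and it assumes the paho-mqtt 1.x style constructor:

```python
import paho.mqtt.client as mqtt

# Hypothetical broker/topic names; QoS 1 asks the broker to acknowledge delivery.
client = mqtt.Client(client_id="sensor-42")
client.connect("broker.example.com", 1883, keepalive=60)
client.loop_start()  # run the network loop in a background thread
client.publish("devices/sensor-42/temp", payload=b"21.5", qos=1)
client.loop_stop()
client.disconnect()
```

Everything heavier (durable queues, routing, fan-out to back-end consumers) can then live on the broker side, where AMQP-class infrastructure is a better fit.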

In short, it seems for many use cases a combination of these protocols is generally going to be the "best" solution, not one or the other by themselves.

Friday, August 23, 2019

The dark side of git...

Git is Great!

As a distributed source control tool, git is great. I love that when I'm on an airplane I can commit code without a wireless connection and still be able to unwind what I was doing. It was clearly designed with the "offline" model in mind. The idea that I can create a quick branch, experiment, make massive sweeping changes, and just drop the whole thing if I realize it sucked... is AWESOME! For a fact, this puts it ahead of its open source predecessors (namely... SVN, CVS, and RCS).

But perhaps a victim of its own success

What I observe, however, is that a lot of folks have taken up a development model where "everything is a branch" and we end up with roles like "pull request approval engineer" (not a real title, but if you end up doing that job, you'll know you're doing it). This problem happens when the number of public branches/forks reaches a count and complexity that far exceeds any value they could possibly have served.

What is productivity?

I'm going to take a somewhat unpopular stance here: in general, branches are antiproductive... before everyone gets their pitchforks out, let me explain my version of "productivity" for a software project. Productivity is producing software that accomplishes the business purpose at a marginal cost that provides positive value. While that might be wordy or even technically incorrect, the overall formula I want to use is: the sum of all activities to produce the software must be less than the value the software provides. In other words, if it costs 2 million dollars to build/deploy some software, but the business can only recoup 1 million dollars in value (via cost savings or new sales or whatever), then I would consider that a failure.
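Put as (trivially simple) code, the inequality looks like this; the function name is made up and the numbers are just the 2M/1M figures from the example above:

```python
def worth_building(total_cost: float, recoverable_value: float) -> bool:
    """A project is 'productive' only if the value it returns exceeds
    the sum of all activities required to produce and deploy it."""
    return recoverable_value > total_cost

# $2M to build/deploy, only $1M recouped -> a failure by this definition.
print(worth_building(2_000_000, 1_000_000))  # False
```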

The branching use case

As a software engineer, I want to create a branch so that other developers cannot see my changes in their builds.

Well that sucks because:

  1. First of all, the activity of creating the branch and merging everyone else's branches into your branch (possibly through yet another branch) is all stuff you would get instantaneously for free if you were all working on the same mainline.
  2. Second, you're deliberately delaying visibility of your changes to the rest of the team... which means the whole notion of continuous integration is getting thrown out the window.
Which brings me to a key question

Are you operating with agility or fragility?

I would contend if you're branching for every feature or bug and merging them back in, your codebase and/or process is more fragile than agile.

Your Thoughts?

Saturday, January 13, 2018

The state of programming languages and frameworks

As a professional software delivery person, I like to keep on top of technology trends and "where the market might be going". Over the last decade and a half, quite a few languages and frameworks have come and gone and very few have had any real staying power. In order to be marketable and knowledgeable in things that "people want to know", I generally find the Tiobe index and Google Trends to be excellent resources for gauging popularity. In my analysis this year, I've established that, relatively speaking, they are in agreement, so I'm going to use google trends (as the charts are easier to embed) to elaborate.

Programming Languages

Before digging into frameworks, there is the question of "which language" is most popular. In this regard, java has been dominant and looks to remain so for a long time. While there is a downward trend, every major language has had its mindshare diminished, I can only imagine because of the explosion of alternate languages in recent years. Assessment: learn java and become an expert, because while the market is crowded, there will always be work and/or people who want to know something about it. To be clear, I disregarded C; though it roughly correlates to C++ in popularity, it is used more in embedded markets and that's not one I'm deep into [yet].

Alternate languages

While I would recommend any newcomer pick one of the "big 5", it really helps to have a "specialized" language you are at least passingly familiar with and can be productive in. Here I also tend to take the "short term" view, as these tend to come and go with great regularity. In that regard, I'd say that Python (technically in the big 5 if you go by many sources) is a solid first choice, but ruby is still a viable alternative. Outside those two, almost any other modern language would be a good idea to pick up and have, as there are always specialty areas that will have a need [even for legacy languages like ADA or Fortran].

Legacy Languages

One area that is often neglected is so-called "legacy languages". These are languages that have fallen out of style and/or been superseded by more modern alternatives. One reason I recommend adding a member of this group to your portfolio is that many experts in these languages are retiring, but the systems running on them will continue to live on. Additionally, when doing a migration from a legacy platform, being able to quickly read and understand what the old platform did is a valuable skill. One area to look at is the "area under the curve", as this represents the "amount of code potentially written". In this regard, perl is a clear winner.

Frameworks

Programming languages, however, are only one dimension. Beyond this, the frameworks available to deliver higher level functionality are a key factor. From that perspective, I grabbed a few notable frameworks and did a comparison (realizing node.js isn't really a framework). In this regard, ruby on rails, while declining in popularity (and surpassed by spring boot), has a HUGE installed base and would clearly be a good choice. The winner's a little unclear here, but coupled with java's popularity as a language, I think one would not go wrong with spring-boot, perhaps keeping ruby on rails as a backup (and it IS the dominant framework in ruby).

Conclusion

From my perspective, I have a good familiarity with java and spring-boot, plus a deep understanding of ruby on rails...so I'm still fairly well positioned and I think I could easily recommend these as "go to" choices. Beyond those, I think I may spend some time playing around with perl again as it strikes me as a market that is set to be underserved at some point in the next 5-10 years...and will be a prime candidate for "need to know to make legacy migrations go smoothly".
