What the heck are source code, environments, and versioning (for non-technical people)

Had an interesting conversation today and thought I'd share some insight for business people dealing with technical folks.

The crux of the question centered around a request we were making for "read only access to a staging environment". I pondered why that line item was there because, from a technical perspective, what we needed was access to the source code for a web application; we really didn't need access to anything in the environment (though it wouldn't hurt). Moreover, with access only to the environment, we wouldn't necessarily have access to the source code, so the actual need wouldn't even be met.

In conversation, a light came on in my head: "source code" and "environment" are almost meaningless to many non-technical people, and in today's "As A Service" world, sometimes they get mingled together.

Brief Overview

Source code is essentially the set of instructions that tell a computer "what to do". So, for example: if date > today display "date must be today or in the past" is source code. What happens on many platforms is that those text instructions get translated to a string of "0's and 1's" that tells the computer how to do this. So the instructions above might be translated to: 00010101010010101010010101001010000000101010011011110101010101011100000101010101010101001010101010010101001010010100 and when you feed that string of 0's and 1's to a computer, it will do what the programmer intended the textual instructions "should" do. This string of 0's and 1's is colloquially called the "binary" by tech weenies.
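For the curious, here is roughly what that date check could look like as real source code. This is only a sketch in Python (one language among many); the function name validate_date and the sample dates are made up for illustration.

```python
from datetime import date


def validate_date(entered: date, today: date) -> str:
    """Return an error message if the entered date is in the future, else an empty string."""
    if entered > today:
        return "date must be today or in the past"
    return ""


# Hypothetical sample dates, chosen only to show both outcomes.
today = date(2024, 1, 15)
print(validate_date(date(2024, 1, 16), today))  # a future date produces the message
print(validate_date(date(2024, 1, 15), today))  # today is fine, so nothing to report
```

The text above is what a programmer reads and edits. The computer never runs it in this form; it runs a translated version of it.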

Now for some wrinkles...this translation is known as "compilation" or, in some languages, "interpretation". So languages like C, Fortran, or Go are "compiled" and languages like Ruby, JavaScript, or Python are "interpreted". And (as a side note) languages like Java are actually a hybrid of both. There are literally hundreds if not thousands of programming languages and all of them use some degree of compilation or interpretation (even if they are a visual language). But the important detail is that the "source code" isn't necessarily the same "file" as the "binary". A simple way to think of it: if you take a screen shot of a powerpoint presentation, you can hand that screen shot to someone else (or post it online, or whatever) and continue to edit the presentation (the source code), and you can even store the source code and the binary (the screen shot) in different locations.
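To make the "source code is not the same thing as the binary" point concrete, here is a small Python sketch. It uses Python's built-in compile() and exec(). One caveat: what Python produces is "bytecode" for its own runtime, not the raw 0's and 1's a processor runs, so treat it as a stand-in for the binary rather than the real thing.

```python
# The "source code": plain text a person can read and edit.
source = 'message = "date must be today or in the past"\n'

# The translation step: turn the text into a form the Python runtime executes.
code_object = compile(source, "<example>", "exec")
binary = code_object.co_code  # raw bytes, our stand-in for the "0's and 1's"

# Running the translated form does what the text said to do.
namespace = {}
exec(code_object, namespace)

print(namespace["message"])   # the result of running the instructions
print(type(binary).__name__)  # the translated form is bytes, not readable text
```

The source text and the translated bytes are two separate things. You could hand someone the bytes without the text, just like handing over the screen shot without the presentation.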

So for example, using our powerpoint example, you could be creating a presentation, take a screen shot and post a "work in progress" on the web, while continuing to edit the powerpoint presentation (the source code). Moreover, you may save "versions" of the source code (and/or binary) so that you can explore different fonts/layouts/colors, and display/edit them differently in different "environments"...maybe one environment is an internal web site with a bunch of unbranded pictures, but there's another public site with the final product.

So what?

So, putting it all together using our powerpoint example:
  • Source Code = Our PPT file that we edit to create screen shots to display elsewhere
  • Binary = Our screen shot of the presentation at a particular point in time
  • Version = A particular revision of either the PPT or the Screen Shot

OK, got it, but again, so what?

Because anything beyond a trivial Hello World program will have multiple versions, with perhaps different variations that change over time. So when building things like ecommerce sites (or almost any non-trivial app), folks need the ability to test and validate new features before turning the feature on for the world to see. In our powerpoint example, there might be a QA step or workflow to validate that the resulting image from the powerpoint uses the right fonts/colors/branding before displaying it to the public. Because of this need, most modern platforms have the notion of different environments for different purposes. Some common examples are:
  • "Local" - the developer's local machine, where only that developer can see the code and the images
  • "Development" - the environment where all the developers can see each other's work put together. Sometimes there can be many "development" environments, depending on how complicated the solution is, but the point is that it's a remote environment or version that isn't exclusive to a single developer
  • "Staging/Integration/QA" - these are other environments used for a variety of purposes. Sometimes they don't exist, sometimes there are dozens of these.
  • "Production" - this is where the world gets the final product
The process of moving the binaries between these environments is generally known as "deployment" and the workflows around deployment are myriad, but the point is that once you've created a version, you move that version between environments.
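As a rough illustration of "deployment", the Python sketch below treats each environment as a labeled slot that holds whichever version was last moved into it. The deploy function and the version labels ("v1", "v2", "v3") are hypothetical; real deployment tooling is far more involved.

```python
# Each environment starts out empty (nothing deployed yet).
environments = {
    "local": None,
    "development": None,
    "staging": None,
    "production": None,
}


def deploy(version: str, environment: str) -> None:
    """Record that a given version is now what this environment serves."""
    if environment not in environments:
        raise KeyError(f"unknown environment: {environment}")
    environments[environment] = version


# Newer work sits in the earlier environments; production lags behind.
deploy("v3", "development")
deploy("v2", "staging")
deploy("v1", "production")

for name, version in environments.items():
    print(name, "->", version)
```

At any moment, different environments can be serving different versions, which is exactly why "which environment?" and "which version?" are separate questions.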

OOOOOhhhh Kaaaay, I think I got it, so what's your point?

So, the confusion arises because of something I mentioned earlier about "interpretation" and "compilation". In our powerpoint example, an interpreted language creates the screen shot automatically in the environment when someone tries to view the content. In a compiled language, the picture is created ahead of time and only the picture moves between environments (it might be tied to a version of the original PPT, but this is only a loose association).

So, for example: suppose I have a PPT called "Mikes_presentation.ppt" and while building it I create 3 versions "Mikes_presentation_v1.ppt", "Mikes_presentation_v2.ppt", and "Mike_presentation_v3.ppt". In this case, I have 3 versions of the source code, and for the purposes of this discussion I store them on my local machine...so I have 3 files. Furthermore, let's say I want to take a screen shot of each of these and send them to someone to take a look (maybe they don't have ppt), and I put them out in three different places...two of them are "For internal use only" and the last one is "for the world to see"...so I might put one each at "preview.mikemainguy.org", "earlyacess.mikemainguy.org", and "www.mikemainguy.org"...let's just pretend those are web sites or "environments". At any given point, if I point someone to those "environments" they might see a screen shot of any of the versions of the source code, because I've deployed different versions to the environments.

However, for "compiled" versions, I may (and routinely would) only send the screenshot to the environment, because the environment itself doesn't need to know anything about the original PPT, it just needs to be a picture. So if I wanted someone to edit that PPT, or enhance it, access to the file in the environment won't be useful because I can't change the original PPT used to generate the picture.

For "interpreted" versions, all someone needs is access to the environment, and if you don't have additional controls, they might edit the PPT that I called "Mike_presentation_v3.ppt" to have completely different content than the one sitting on my hard drive.
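The difference between the two cases can be sketched in a few lines of Python. Everything here is a toy model: the Environment class, the screen shot file name, and the slide contents are invented for illustration; only the site names and the PPT file name come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Environment:
    name: str
    binary: Optional[str] = None  # the "screen shot"
    source: Optional[str] = None  # the contents of the "PPT"

    def has_source(self) -> bool:
        return self.source is not None


# What sits on my hard drive as Mike_presentation_v3.ppt (contents are made up).
local_copy = "slide 1: intro, slide 2: pricing"

# "Compiled" case: only the picture is sent to the environment.
compiled_env = Environment(name="www.mikemainguy.org", binary="screenshot_v1.png")

# "Interpreted" case: the PPT itself lives in the environment.
interpreted_env = Environment(name="preview.mikemainguy.org", source=local_copy)

# With no extra controls, someone with access edits the copy in the environment.
interpreted_env.source = "slide 1: intro, slide 2: something else entirely"

print(compiled_env.has_source())              # nothing to edit here, just the picture
print(interpreted_env.has_source())           # the source is sitting right there
print(interpreted_env.source == local_copy)   # and it no longer matches my copy
```

In the first environment, access gets you a picture and nothing more. In the second, access gets you the source, and also the ability to make it drift from what the author has.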

OK, is that good or bad?

It's honestly neither, but it does illustrate (I hope...I know it's been a bit of a ramble) that access to the "environment" doesn't necessarily give you access to the "source code"...and an "environment" might not actually reflect what your "source code" (or the copy with the same version) can generate.

So honestly, what's the big deal?

Well, it gets confusing because some tools (the "interpreted" examples) inherently store the source code in an "environment". This means "environment equals source code", but other tools (the "compiled" examples) don't necessarily equate the two. It's further complicated by the fact that this is a trivial overview and reality is MUCH more complicated (some "compiled" code is also stored in the environment, and "environment" also includes things like operating systems, device drivers, networking...so the source code doesn't necessarily give you everything you need to reproduce it).

At the end of the day, hopefully I gave a (not so) brief primer on the semantics behind "source code", "environments", and "versioning"...and my full apologies to all the folks who will be coming out of the woodwork to explain the million different ways this is technically not 100% correct...to them, I just say "the business people don't care, we just need a better way to explain the concepts".

Programming languages by type

Comments

Mike Mainguy said…
Holy crap, I changed my template and now my admin interface has disappeared...guess I'm not as technical as I thought :o
