Containers: Sticking to the basics… Part 1


If you are here reading this blog, then I assume you are somehow, directly or indirectly, related to the magical field of computer science. I used the term “magical”; however, there is no magic at ground zero. It is only simple, basic logic that appears to be magic. One such piece of computer-science magic is containers. You must have heard this term very often, unless you have been on an isolated island for the last decade. That is enough introduction to make you comfortable with my writing. Let me get down to business and lay out the agenda of this blog post, which basically means what you are going to learn here if you have zero knowledge about containers.

Firstly, like every other blog post, we will start with a basic introduction and answer the questions related to “what”: what is a container? In the next section we will answer the questions related to the most important “wh” word in the scientific community, i.e. “why”: why were containers made? The next blog post will then show the simple, basic logic of containers and answer the questions related to “how”: how is a container made? Spoiler alert: the next post might get a bit long and technical, but that is where the fun part lies.

What?

“A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another,” says the official Docker site. When I read it, I said to myself: “All right, I understood. It’s software whose code and dependencies are bundled together. But when I first used LXC containers, I got a terminal where I could do whatever I wanted. That looks pretty similar to a VM; if I were smart and rich enough, I could even have launched nukes from it. But the definition doesn’t say anything like that. So what is a container? Is it some lightweight system emulation done in software? Or is it some geeky mechanism for fooling the kernel?” It looks like magic again: it appears to be a machine, but it actually is not. Confused?!

Firstly, it must be crystal clear that a container has nothing to do with emulation. Let us understand this with the analogy of an apartment complex. All apartments in a complex share common land, water, security, etc. However, each apartment is well isolated from the others, and each apartment has limits on water usage, parking area, electricity usage and so on. Now, say I live in one of those apartments. My rights are limited to my apartment, and I think of my apartment, not the whole complex, as my home.

Apartment in complex = Container in OS.

The bash terminal we get is actually a process whose limits are well-defined. You can think of it like this: the operating system kernel forked a process, defined and limited its file system (say, the OS made /some/random/path the root, i.e. /, for that process), and told it, “you are the first process in your world.” The limits of the process are well-defined, and it can’t see any other process running on the host machine. If the containerised process spawns a child, then the child can be seen by the containerised process and vice versa, but the child will have the same limits as its parent.
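To make this less abstract, here is a minimal sketch in Go of what “fork a process, give it its own world and its own root” could look like, assuming Linux and root privileges. The path /some/random/path is the hypothetical root directory from the paragraph above; it would need to contain a shell (and an empty /proc directory) for this to run. This is only an illustration of the underlying idea, not how Docker or LXC actually implement things.

```go
// container_sketch.go — a minimal, Linux-only sketch; run as root.
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

func main() {
	if len(os.Args) > 1 && os.Args[1] == "child" {
		child()
		return
	}
	parent()
}

// parent re-executes this same binary as "child", but inside fresh
// PID and mount namespaces: the child starts in a new "world".
func parent() {
	cmd := exec.Command("/proc/self/exe", "child")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
	}
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	must(cmd.Run())
}

// child is "the first process in its world": PID 1 in its namespace.
func child() {
	fmt.Println("my pid:", os.Getpid()) // prints 1

	must(syscall.Chroot("/some/random/path")) // that path is now "/" for us
	must(os.Chdir("/"))
	must(syscall.Mount("proc", "/proc", "proc", 0, "")) // so ps can work

	sh := exec.Command("/bin/sh") // must exist inside the new root
	sh.Stdin, sh.Stdout, sh.Stderr = os.Stdin, os.Stdout, os.Stderr
	must(sh.Run())
}

func must(err error) {
	if err != nil {
		panic(err)
	}
}
```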

Containerized Process

For example, if you have worked with containers and started one with a terminal, have you ever tried running ps aux in the container’s terminal? You will see the bash process running as root with PID 1, and ps aux itself as a child of that bash process with a PID > 1. Processes need resources to perform their tasks, and the resources are provided to the processes inside a container by the operating system’s kernel. If I am not sounding too technical and you are still with me, then there is one last thing to understand: what if the containerised process forked by the kernel comes from a binary that lives inside the file system specified for that process, and all of its dependencies are resolved relative to that file system? Then, of course, I have a process running in isolation, but it appears to be a whole machine to me whenever that process is a terminal.
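For illustration, a session inside such a container might look roughly like this (a hypothetical, abbreviated transcript; the exact columns and numbers will vary):

```
$ ps aux
USER   PID  %CPU %MEM COMMAND
root     1  0.0  0.1  /bin/bash
root    17  0.0  0.0  ps aux
```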

Containerized process is init

Jessie Frazelle, a Docker maintainer, defines a container as a “combination of different ingredients that are baked into the Linux kernel and we use those to create this concept of a container”. I would take this definition, tweak it slightly, and put it as: “a combination of different Linux kernel features, used together to create an isolated and lonely process whose resources are very well-defined, which everyone calls a container”. If you are wondering which Linux features, then I would ask you to have faith; we will come back to them in the next post. For now, the thing to remember is that a container is an isolated, lonely process.

Does that mean Docker’s definition of a container is incorrect? A big NO. The standard unit of software is a process. Do processes inside containers run fast? Yes. From one computing environment to another? Yes, that can be done too. Suppose I need to move a container to another host; what would I do? Tar the file system defined for the containerised process, ship the tarball, untar it on the destination, and create an isolated, lonely process there. Mission “environment change” accomplished, the manual way. Docker automates this in its own way.

Why?

Now that we know what containers are, the next obvious question for any curious learner is “why?”: why do we use containers? Why would anyone want an isolated process? Everyone knows the famous quote, “Necessity is the mother of invention.” So what were the necessities that led to the invention of containers? To answer these questions we have to take a tour through history.

Computer science is part and parcel of the ever-growing IT industry. In the first decade of the 21st century, the IT industry was growing phenomenally: monolithic applications were slowly being transformed into decoupled services, there was demand at huge scale, customers wanted quick delivery of applications, and there was a rise of multiple environments. A rise of multiple environments? How could that be an issue? The answer is pretty simple. You can have different environments for development, testing and production, and differences between them lead to production issues, which give rise to every developer’s favourite answer: “it works on my machine.” And yes, it may leave you with a red-faced customer too.

What we just discussed was known as the deployment problem of the early 21st century. You have many service stacks, to name just some: front-end UI, backend DB, background workers, analytics, etc. At the same time you have multiple deployment environments, say a development server, a QA server, a production cluster, and disaster-recovery machines. Now you have an any-to-any mapping between service stacks and deployment environments, and that creates the matrix of hell. If that is not clear yet, the pictures below should make it evident.

The Deployment Problem
The deployment problem. Image by: Bret Fisher
The matrix of hell
The matrix of hell. Image by: Bret Fisher

The issue was solved when someone noticed that the same problem existed in the transport industry. How? They had multiple types of goods and multiple modes of transportation, and the same any-to-any mapping problem. And how did they solve it? With containers. The images below make it clear.

Deployment problem in shipping industry.
Deployment problem in shipping industry. Image by: Bret Fisher
How containers solved the shipping problem
How containers solved the shipping problem. Image by: Bret Fisher

If you were fond of the images in your high-school history books, then you must have seen pictures like the ones below: how goods were transported before, and how they are transported now. Yes, the transportation business uses containers. Their containers are not the same as ours, but for the purposes of the analogy you can think of them as the same.

How containers transformed shipping industry
How containers transformed shipping industry

If you are wondering how containers solve the deployment problem, then I would say you already know the answer if you understood what a container is. OK, in case you are unable to guess it, let me make things clear. A container is an isolated, lonely process with a well-defined file system. From now on, when I say a container is running, I mean that an isolated, lonely process with well-defined resource access is running on a host system. The root of the container’s file system usually won’t be the host’s / (although it can be) but some other path on the host, like /path/to/container_file_system, and only the folders and sub-folders inside that path are accessible to the container. Let me take an example and explain.

Now assume a host, host1, where a container X is running as a process whose file system access is /host/container/file_system. Obviously, the binaries and dependencies of the containerised process X live inside that file system. Now the same container has to run on a target machine, host2. You tar /host/container/file_system, copy the tarball to host2, and untar it there. Then you run the isolated process on host2 with access to the untarred file system. It should just work. Simple!
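In terminal form, the manual migration might look something like the sketch below. The host names, user and paths are the hypothetical ones from the example, and the final chroot line only reproduces the file-system part of the isolation; the namespace limits from the earlier sketch would still have to be applied.

```sh
# on host1: pack up the container's file system
tar -czf container_x.tar.gz -C /host/container/file_system .
scp container_x.tar.gz user@host2:

# on host2: unpack it and start the isolated process inside it
mkdir -p /host/container/file_system
tar -xzf container_x.tar.gz -C /host/container/file_system
chroot /host/container/file_system /bin/sh
```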

How containers solved shipping issues for applications
How containers solved shipping issues for applications
The matrix of hell solved
The matrix of hell solved

How?

Wait for it; another post is coming soon. Digest this much information for now. I hope this post has changed your perception of containers. In the next one we will create a container all by ourselves. Until then, hasta la vista.

References:

[1] Introduction to Containers for Local Dev With Mura. Talk by Bret Fisher.

[2] Building Containers in Pure Bash and C. Talk by Jessie Frazelle. https://containersummit.io

[3] https://www.docker.com/

[4] https://linuxcontainers.org/