Wednesday, June 2, 2010

Slightly Skeptical View on Solaris Zones

Zones are a lightweight VM concept, a further development of the idea of BSD jails, which were added to FreeBSD in 1999. Zones were designed at Sun by Andrew Tucker and, according to Sun, have better security and are better integrated into the OS. To say that zones are great would be an understatement. They completely changed the Unix landscape (including the Unix security landscape), and this is why Solaris is the first true XXI century Unix available on the marketplace. It is not an accident that AIX 6 copied the concept from Solaris 10: imitation is the highest form of flattery...
The idea of a zone is to create an isolated process tree. Processes inside the zone cannot affect processes outside. Thus we get an environment similar to a virtual machine, but with minimal overhead; it is usually called a lightweight virtual machine. Unlike a complete virtual machine environment such as VMware or AIX 5.3 LPAR, zones are focused mainly on security. It is important to stress that they have the smallest overhead among all mainstream virtualization technologies and that they have a clean and simple design. Unlike LPAR in AIX (a "full-size", VM/360-style virtual machine implementation), zones can be used on both Intel and SPARC versions of Solaris 10. Unlike VMware, you have one instance of the OS. (I always wondered what is so great about running ten instances of OS virtual page management on the same hardware and paying EMC an additional $5K for the privilege; IBM used to avoid this problem in VM/CMS by factoring virtual memory management into the VM level.) The same is partially true of schedulers. In a very deep way, full virtualization solutions cannot compete with lightweight virtualization unless they use "minimized OSes" in which all "extras" are factored out to the VM level.
It seems that zones are becoming a powerful new security model. Instead of one computer per server, one computer can have multiple jails for applications provided by zones, with each zone providing one service. This is especially attractive for large enterprises, where the "fight for privileges" between users and administrators is especially acute. Now it can be resolved by granting root access to the zone with a particular application. That's a huge advance over the mess that used to exist.
The most important feature of zones is this method of isolating applications from each other and from the "mothership". It can be used as a new, natural and powerful security paradigm for all but the most convoluted applications (I would not recommend running Oracle in a zone if you still have some hair on your head; at least not right now ;-).
If a service in the zone is compromised, the activities of the attacker will be constrained to the zone, but will also be fully visible to the administrator, at minimal risk to the administrator. This model offers substantially enhanced monitoring in comparison with separate hardware devices like a network IDS, or full virtual machines (like AIX LPAR). The latter offer little reliable insight into their operation once compromised. In a zoned environment the global zone can be a perfect point from which to watch over zones. Also, constraints on system calls greatly hamper the ability of the attacker to employ rootkits.
Zones benefited from approximately five years of experience with FreeBSD jail technology (as I mentioned above, jails were added to FreeBSD in 1999) and managed to move further along the path pioneered by FreeBSD. Solaris 10 allows separate resource allocation for each zone (see Solaris Containers-Resource Management and Solaris Zones).
Recently Sun extended the concept of a zone into a more sophisticated mechanism and implemented a "Linux zone" which can run Linux executables.
Sun terminology is confusing and often unclear: in one place they use the term "zone" and in another the term "container". I tend to think that:
zones + resource management = Solaris containers
There is also an analogy between a zone and the Java sandbox concept. Each zone requires its own dedicated IP address and, using the Solaris cinematographic analogy, represents an isolated satellite revolving around the unknown planet that can communicate with other zones and the "mothership" only via network services.
The number of zones that can be effectively hosted on a single system depends on the total resource requirements of the application software running in all of the zones. Each zone duplicates certain daemons (cron, syslog, etc.), so there is an overhead.
A minimalist zone needs approximately 50 MB of disk and 15 MB of memory. Sun recommends 100 MB of disk space for a zone as a minimum. If each zone does not do a lot of processing, or the zones do very similar processing (synergy, as in the case of multiple web servers), it is probably possible to host a couple of dozen web servers on a typical V210 configuration with 2 CPUs and 4 GB of RAM.
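The arithmetic behind such an estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope model: the 15 MB and 100 MB per-zone figures come from the paragraph above, while the per-application footprint, the amount reserved for the global zone and the disk size are invented numbers that you would have to replace with your own measurements.

```python
def max_zones(ram_mb, disk_mb, app_ram_mb, app_disk_mb,
              zone_ram_mb=15, zone_disk_mb=100,
              global_ram_mb=0, global_disk_mb=0):
    """Estimate how many zones fit on a box.

    zone_ram_mb / zone_disk_mb: base overhead of an empty zone
    (figures quoted in the text). All other values are assumptions
    supplied by the caller.
    """
    usable_ram = ram_mb - global_ram_mb
    usable_disk = disk_mb - global_disk_mb
    if usable_ram <= 0 or usable_disk <= 0:
        return 0
    # Each zone costs its base overhead plus the application it hosts.
    by_ram = usable_ram // (zone_ram_mb + app_ram_mb)
    by_disk = usable_disk // (zone_disk_mb + app_disk_mb)
    # The scarcer resource sets the limit.
    return min(by_ram, by_disk)


# Hypothetical web-hosting case: 4 GB of RAM, 73 GB of disk,
# 150 MB of RAM and 500 MB of disk per web server (assumed),
# 512 MB of RAM and 5 GB of disk kept for the global zone (assumed).
print(max_zones(4096, 73000, app_ram_mb=150, app_disk_mb=500,
                global_ram_mb=512, global_disk_mb=5000))
```

With these made-up inputs memory is the limiting resource, and the result lands in the "couple of dozen" range. CPU is not modeled at all.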
The problem with zones is not only that they add complexity, but that people often want from a lightweight VM the capabilities of a full VM (hardware hypervisor). To withstand this barrage of customer requirements is pretty difficult. As a result zones became a complex, expensive kludge, and the line delineating zones and full-scale virtual machines becomes pretty fuzzy.
Currently Sun is experiencing a period of "irrational exuberance" with zones: instead of just polishing the offering and clearly identifying its limitations, the developers are trying to extend it in all directions. Some directions are (technically) interesting, like Linux zones in recent Solaris 10 x86 (zones that are able to run unmodified Linux binaries); some are questionable, like access to raw devices in zones to run Oracle databases. But all of them add complexity, and it is not clear what the real return on the investment is.
For example, if a person wants to run unmodified Linux binaries (and this is mainly a workstation problem), in most cases (unless you are running chip tracing software or another binary with huge CPU requirements) he/she should be able to use a SunPCi card to solve the problem. I do not understand why Sun does not make the SunPCi card work on Intel boxes and use this solution for those few cases when you have no option but to run Linux binaries until a native Solaris solution emerges. What exactly prevents this? In the extremely rare case when you want raw power, it should be a SunPCi with a high-end Opteron. In this case your main application can be isolated from the rest of the system, and it can also be a Windows or Apple application, not just Linux, which is probably the more practically important case.
I hope that this "everything is possible" activity will stop or at least slow down in late 2006, when Sun gets feedback about the rate of zones adoption in the industry (I bet it is slow, and it is additionally slowed by the problems with the initial implementation and all the new features that Sun is adding to the plate). When everything is possible nothing is easy...
As a zone is a lightweight VM created within a single instance of the Solaris Operating System, you can boot a zone, log in to a zone, etc., as if it were a separate computer. The original instance of Solaris ("mothership") is called the global zone. It always has the name global. The global zone runs system-wide processes and is used for zone administrative control. A regular user of the global zone can be a root user of a zone and thus can boot the zone, add/delete users, etc. That's a nice separation of duties in a large enterprise environment. Here is the summary of local/global zone features:
Global zone
  • Is assigned ID 0 by the system
  • Provides the single instance of the Solaris kernel that is bootable and running on the system
  • Contains a complete installation of the Solaris system software packages
  • Can contain additional software packages or additional software, directories, files, and other data not installed through packages
  • Provides a complete and consistent product database that contains information on all software components installed in the global zone
  • Holds configuration information specific to the global zone only, such as the global zone host name and file system table
  • Is the only zone that is aware of all devices and all file systems
  • Is the only zone with knowledge of non-global zone existence and configuration
  • Is the only zone from which a non-global zone can be configured, installed, managed, or uninstalled
Local zones
  • Is assigned a zone ID by the system when the zone is booted
  • Shares operation under the Solaris kernel booted from the global zone
  • Contains only a subset of the complete Solaris Operating System software packages
  • Contains Solaris software packages shared from the global zone
  • Can contain additional installed software packages, not shared from the global zone
  • Can contain additional software, directories, files, and other data created on the non-global zone that are not installed through packages or shared from the global zone
  • Has a complete and consistent product database that contains information on all software components installed on the zone, whether present on the non-global zone or shared read-only from the global zone
  • Is not aware of the existence of any other zone(s)
  • Cannot install, manage, or uninstall other zones, including itself
  • Has configuration information specific to the non-global zone only, such as the non-global zone host name, IP and file system table
Processes in zones are isolated from other processes: even a process running with superuser credentials in a particular zone cannot view or affect activity in other zones. Processes that are assigned to different zones are only able to communicate through network APIs. For example, NFS can be used to share files between zones.
Each zone is given a portion of the file system hierarchy. Because each zone is confined to its subtree of the file system hierarchy, a workload running in a particular zone cannot access the on-disk data of another workload running in a different zone. Files used by naming services reside within a zone's own root file system view. Thus, naming services in different zones are isolated from one another and the services can be configured differently.
Zones are ideal for hosting applications which can adversely influence each other, and make it possible to consolidate several such applications on a single server. They are perfect for hosting providers, as they permit an adequate level of isolation of clients without an excessive and punishing penalty that is difficult to justify in the world of cut-throat competition typical of web hosting. The fact that Solaris 10 can run on regular x86 computers (for example, the PowerEdge 1950 and 2950 from Dell) makes this an even more attractive value proposition.
The cost and complexity of managing numerous small servers that host just one application makes it more feasible to consolidate several applications on larger, more scalable servers. A zone also provides an additional abstraction layer.
Each zone has one or several dedicated IP addresses. A zone cannot share an IP address with the "mothership" (global zone) or with other zones.
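When planning addresses for many zones, this "no shared IP" rule is easy to check mechanically before any zone is configured. Here is a minimal Python sketch; the zone names and addresses are invented for illustration, and the check works on a plain dictionary that you maintain yourself, not on anything read from the system.

```python
from collections import defaultdict


def shared_addresses(zone_ips):
    """Return {ip: [zones...]} for every address claimed by more than one zone.

    zone_ips maps a zone name (including "global") to the list of
    addresses planned for it. An empty result means the plan is clean.
    """
    owners = defaultdict(set)
    for zone, ips in zone_ips.items():
        for ip in ips:
            owners[ip].add(zone)
    return {ip: sorted(zones) for ip, zones in owners.items() if len(zones) > 1}


# Hypothetical address plan with one mistake: web and db both claim 10.0.0.3.
plan = {
    "global": ["10.0.0.1"],
    "web": ["10.0.0.2", "10.0.0.3"],
    "db": ["10.0.0.3"],
}
print(shared_addresses(plan))
```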
The global zone ("good old Unix") has a dual function. It can run processes like any normal Unix system, but it can also manage satellite zones. Each zone is also given a unique numeric identifier similar to a UID, which is assigned by the system when the zone is booted. The global zone always has ID 0. Zone names and numeric IDs are discussed in Using the zonecfg Command.
When logged in as root to the global zone, the administrator can monitor and control the system as a whole. All processes and all files are visible from the global zone. That's a very convenient feature which permits advanced debugging of complex applications.
A non-global (satellite) zone is administered by a zone root user, who is just a regular user of the global zone. The "global administrator" ("mothership" root) can assign the Zone Management profile to any user, converting him into a zone admin. It is important to understand that zone admin privileges are limited to the zone(s) he administers; in the global zone he is just a regular user. This is a very nice, very slick way to resolve the "root hell" problem typical of large corporations, where each application maintainer needs root privileges to perform his duties and as such encroaches on the turf of the primary server administrators and can negatively affect them and/or other users, since he has the privileges to alter any parameter of the system. See Non-Global Zone Characteristics for more information.
The following figure from Sun documentation shows a system with four zones. Each of the zones apps, users, and work is running a set of applications unrelated to the workloads of the other zones. Each zone can provide a customized set of services.
Each zone also has a node name that is completely independent of the zone name. The node name is assigned by the zone admin. For more information, see Non-Global Zone Node Name.
For more information about the steps involved in creating a zone, see Solaris Zone Creation Examples and the man page for the zonecfg command.
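To give a feel for what a zone configuration looks like, here is a small Python sketch that builds a zonecfg command file as text. The subcommands it emits (create, set zonepath, set autoboot, add net, set address, set physical, end, verify, commit) are standard zonecfg subcommands; the zone path, IP address and interface name are hypothetical, and the function only produces text, it does not run zonecfg.

```python
def zonecfg_script(zonepath, address, physical, autoboot=True):
    """Build the text of a zonecfg command file for a simple zone
    with one network interface. All argument values are supplied by
    the caller; nothing here is read from or written to the system."""
    lines = [
        "create",
        f"set zonepath={zonepath}",
        f"set autoboot={'true' if autoboot else 'false'}",
        "add net",                      # open the network resource scope
        f"set address={address}",
        f"set physical={physical}",
        "end",                          # close the network resource scope
        "verify",
        "commit",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical zone: path, address and NIC name are made up.
print(zonecfg_script("/zones/webzone", "192.168.1.10", "bge0"), end="")
```

The resulting text can be saved to a file and fed to zonecfg from the global zone with something like zonecfg -z webzone -f webzone.cfg, after which the zone is in the configured state described in the next section.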

Zone State Model

A zone is a lightweight VM, and we should keep this fact in mind when navigating our way through obscure terminology. Sun introduced too many states into this concept, with somewhat confusing names and semantics (for example, it looks like the "installed" and "ready" states are more like the "offline" and "online" device states ;-). See the zoneadm(1M) man page, which unfortunately does not explain this issue despite the fact that this is the command designed for changing VM states. It looks like a zone can be in one of the following states.
Undefined   --Create-->     Configured
Undefined   <--Delete--     Configured
Configured  --Install-->    Installed
Configured  <--Uninstall--  Installed
Installed   --Ready-->      Ready
Installed   <--Halt--       Ready
Ready       --Boot-->       Running
Installed   <--Shut down--  Running
  1. Undefined. This is the state in which zone configuration has been started but not yet committed to storage, or in which the zone has been deleted.
     
  2. Configured. The zone's configuration is completely specified and committed (written to disk). However, some elements of the zone's application environment (root password, etc) that must be specified for the boot are still missing. 
     
    • To change to the next (installed) state:
      zoneadm -z zonename install
    • To change to the previous (undefined) state:
      zonecfg -z zonename delete
  3. Installed. The zone is completely configured and the VM is ready to boot. The zoneadm command can be used to verify that the configuration is bootable.
    • To change to the next (ready) state, which is optional, and then boot:
      zoneadm -z zonename ready
      zoneadm -z zonename boot
    • To change to the previous (configured) state:
      zoneadm -z zonename uninstall
  4. Ready. The transition to this state from the installed state is essentially switching on the VM (like the online button on a device). At the end, the virtual platform for the zone is established: the kernel creates the zsched process, network interfaces are plumbed, file systems are mounted, and devices are configured. A unique zone ID is assigned by the system. At this stage, no processes associated with the zone have been started, so normally this is a transitional state toward the running state (see below). But in beta versions of Solaris 10 you need to explicitly change the zone into this state to be able to boot it:
      zoneadm -z zonename ready
    zoneadm halt and a system reboot return a zone in the ready state to the installed state.

  5. Running. User processes associated with the zone application environment are running. The zone automatically enters the running state from the ready state as soon as the first user process associated with the application environment (init) is created.
    • To log in to the running zone:
      zlogin options zonename
    • To reboot the zone:
      zoneadm -z zonename reboot
    • To halt the zone:
      zoneadm -z zonename halt
    zoneadm halt and a system reboot return a zone in the running state to the installed state.

  6. Shutting down and down. These states are transitional states that are visible while the zone is being halted. However, a zone that is unable to shut down for any reason will stop in one of these states.
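Since the state names are easy to mix up, it may help to write the model down as a plain transition table. The Python sketch below is my reading of the list above, not anything taken from Sun's code: the transitional "shutting down" and "down" states are left out, boot from the installed state is treated as a single step, and the action names are simply the zoneadm/zonecfg verbs used in the text.

```python
# (current state, action) -> next state, as described in the list above.
TRANSITIONS = {
    ("undefined", "create"): "configured",
    ("configured", "delete"): "undefined",
    ("configured", "install"): "installed",
    ("installed", "uninstall"): "configured",
    ("installed", "ready"): "ready",
    ("installed", "boot"): "running",   # passes through "ready" on the way
    ("ready", "boot"): "running",
    ("ready", "halt"): "installed",
    ("running", "halt"): "installed",
    ("running", "reboot"): "running",
}


def next_state(state, action):
    """Return the state reached by applying action, or raise ValueError
    if the action makes no sense in the current state."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a zone in the {state} state") from None


def run(actions, state="undefined"):
    """Apply a sequence of actions and return the final state."""
    for action in actions:
        state = next_state(state, action)
    return state


print(run(["create", "install", "boot"]))          # running
print(run(["create", "install", "boot", "halt"]))  # installed
```

A table like this makes the asymmetry obvious: there is no way back from running to ready, only to installed.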
If resource management features are used, it is best to align the boundaries of resource management controls with those of the zones. This alignment creates a more complete model of a virtual machine, where namespace access, security isolation, and resource usage are all controlled.