Contact Us | Request Support | Monitoring Portal | Customer Portal | *

1-650-964-9100

  • Home
  • What is Cloud Computing?
  • Services
    • PrimaCloud Enterprise Cloud Computing
      • Features & Benefits
      • Component Services
      • Virtual Private Data Centers
      • Performance
      • Reliability
      • Security
    • PrimaSys Managed Private Cloud Deployments
      • Choosing Private Cloud
      • Implementation
      • PrimaSys Case Studies
    • PrimaCare Operations-as-a-Service
      • OaaS Detailed Description
      • OaaS Plan Comparison
      • Professional Services
      • Highly Available Cloud Cpanel
    • PrimaView Enterprise Grade Remote Monitoring
      • PrimaView Features
      • PrimaView NimSoft Professional Services
    • Frequently Asked Questions
  • Who You Are
    • Growing Enterprise
    • Start-Up Company or Entrepreneur
    • Colocation or Cloud Computing Customer
    • Shared Hosting or Virtual Private Server User
    • Hosting or Managed Service Provider
    • IT Operations Manager
  • Why Choose ENKI
    • Comparing Cloud Options
    • Case Studies
      • Media Rights Management Company
      • Web Design and Hosting Company
      • Political Web Services Company
      • Media File Sharing Start-Up
      • Financial Services Company
      • Online Gaming Company
      • Internet Advertising Company
      • Hedge Fund
    • Key Benefits
    • Videos & Downloads
    • Buying from ENKI
    • Promotions
    • Testimonials
  • About ENKI
    • The Enki Way
    • Management
    • Partners
    • News
    • Investor Relations
    • Legal
    • Service Level Metrics
  • Enki Blog
Enki Blog

Managed Cloud Blog

  • Home
  • Feed
Nov 04
2008

The role of remote monitoring in Cloud Computing

Posted by: Eric Novikoff

Tagged in: Untagged 

Tonight as I watched the election returns on CNN, I wasn't sure what was more amazing: Obama's historic victory, or CNN's technology.  CNN was able to display up-to-the minute results of each state's elections simply at the touch of their news anchor onto the screen of an election-reporting system.  The anchor could touch a state, then touch a metric, such as various demographics, and instantly cut the election results up by age, exit poll answers, or racial composition.  It blew me away.

But it also reminded me of trying to manage a complex set of application deployments into the Cloud - a virtual private data center.    When you take into account that a reasonably complex multi-tier application with significant load can consume tens of virtual servers, all of which need to function successfully in a coordinated ballet, you realize you need the kind of information and analytic capabilities that CNN has available, just to tune it and keep it working.  Because of this, we've invested in an amazing remote monitoring package, NimBus, which we provide as a service to our hosted customers as well as other customers.  NimBus allows measurement of pretty much any parameter inside a virtual server or the applications running inside of it, from simple (but important) aspects such as CPU or memory utilization, to more complex metrics like database queries per second, slow query count, or pages served per unit time from a web server.  In addition, NimBus can perform user-experience validation by running synthetic (fake) transactions against an application and reporting what the user experiences in terms of response time and page correctness.   All of this is summarized on a customizable dashboard, much like CNN's election status screen (click to enlarge):

monitoringdashboards

So, armed with this information - and hopefully not overwhelmed with too much information - we (or our customers) can tune and adjust their applications for appropriate cost/performance tradeoffs or diagnose performance or efficiency issues.    It has produced great results for the customers who implemented remote monitoring, improving their application response time and uptime, as well as reducing costs.

However, the road hasn't been easy.  The Cloud, by its very nature, is constantly in flux, mutable.   This presents a contradiction in goals to an organization:  to optimize something, it needs to be stable so you can measure it and make changes; yet to get the best economies out of the cloud, you need your infrastructure to be elastic, scaling on demand.  Because servers can come and go, and IP addresses can change, setting up a monitoring system and keeping it running isn't easy.  How can you monitor Apache server #2 if it is only instantiated when the web site's load is too high for one Apache?   Luckily, most of our clients' deployments don't change radically over the short term, so the monitoring package can be set up and continue to run for quite a while before it needs reconfiguration.  However, for very elastic loads, you need to either observe the results of your cloud deployment instead of its internals (such as by snooping on its communications with customers) or have your automatic instance deployments also request on-demand monitoring.

Once you add monitoring to your cloud deployment, you can start to take advantage of the powerful capabilities of Total Quality Management, a management philosophy popularized by W. Edwards Deming.  A core principle of TQM is CPI or continuous process improvement, summarized with the following chart:

  tqmchart

TQM says you want to set goals for your process (in this case your software deployment), then you want to run the process (deploy the software), measure the results against the goals, and adjust the settings based on the goals to control the process to produce the desired results (typically a satisfy SLA in the software deployment world.)  However, the real power comes when you report on the results of this process and then use it to take another look at your goals.  The result is continuous improvements in "quality" - in other words, in your ability to deliver the results of your process successfully.

This is how we use monitoring to get the most out of Cloud deployments.

But then I had this insight: why do us - humans - have to be in the loop at all with respect to acting on the monitoring?   Naturally, if the monitoring detects some sort of application or hardware failure, humans need to get involved.  But are humans really necessary for maintaining SLAs?   In today's cloud deployments, especially with systems like Amazon's EC2, the users' application is responsible for both measuring and taking action on application performance issues.   This complicates deployment and coding, as well as tying your application to a particular cloud provider.  However, I believe that the next generation of cloud deployment frameworks will be able to do this automatically, by integrating general-purpose monitoring applications with policy-based cloud management engines.   At ENKI, using our monitoring services, we are already able to automate some of this policy-based management without the need for the application to be aware of the details of this process.   However, a quick caution is in order: if the application isn't designed from the ground up to be elastic (for example, to have new web servers added dynamically) then all the automation in the world won't allow it to participate in automated SLA assurance.

Comment (0)
Share to Facebook Share to Twitter Stumble It Share to Reddit Share to Delicious Share to Google Buzz 
Social Widgets Ultimate Edition - Copyright © 2010 by Turnkeye.com

Free Cloud Buyer's Guide

Our informative guide is full of best practices to help you choose the right Cloud vendor for your business and to make your cloud application deployment successful.

Download Now

Latest Blog Entries

  • Going beyond compliance: achieving true security in the Cloud
  • The Straight Dope About Cloud Downtime and the Myth of Perfection
  • The two basic types of cloud architecture
  • Why overallocation makes cloud computing services impossible to compare
  • Does Cloud Computing Drive Vendor Lock-in?
  • Is Amazon "all that?"
  • Report From VMWorld: is the cloud industry getting ahead of itself?
  • Is Cloud Hype Beneficial?
Business Strategy Case Studies Cloud 101 Cloud Industry Cloud Usage Commentary ENKI Information Events First Person Infrastructure News Philosophy Pricing Techniques Technology

Blog Archive

  • March 2012(2)
  • February 2012(2)
  • January 2012(1)
  • September 2011(2)
  • August 2011(2)
  • May 2011(3)
  • April 2011(4)
  • March 2011(1)
  • February 2011(2)
  • January 2011(5)
  • October 2010(1)
  • September 2010(5)
  • August 2010(2)
  • June 2010(1)
  • May 2010(1)
  • April 2010(1)
  • March 2010(1)
  • February 2010(1)
  • January 2010(1)
  • October 2009(2)
  • September 2009(7)
  • August 2009(3)
  • July 2009(3)
  • June 2009(6)
  • May 2009(2)
  • April 2009(4)
  • March 2009(2)
  • February 2009(1)
  • January 2009(1)
  • November 2008(1)
  • October 2008(2)
  • August 2008(4)
  • July 2008(2)
  • June 2008(1)
  • May 2008(1)
  • April 2008(1)
  • February 2008(3)
  • January 2008(3)
  • December 2007(2)
  • November 2007(1)
  • September 2007(1)
  • August 2007(3)
  • June 2007(1)
  • May 2007(1)
  • March 2007(1)
  • February 2007(4)
  • January 2007(3)
OVERVIEW
  • About PrimaCloud
  • About PrimaCare
  • Key Benefits
  • Comparing Cloud Options
HELP CENTER
  • Frequently Asked Questions
  • Contact Us For Support
  • Terms and Conditions
SELF SERVICE PORTALS
  • PrimaCloud
  • Monitoring
  • Customer Portal
  • Discount Domains & Certificates
Follow @enkicloud
LOGO_CoFounderWebsite
Copyright © 2011 ENKI LLC