Tuesday, June 1, 2010

Challenges & Risks of Implementing Cloud Computing


(Updated 10/25/2011)
After writing about the benefits of cloud computing, I mentioned that it’s not without some risks and downsides. Cloud computing presents a strong case for cost savings, new capabilities, flexibility and speed, but to do a proper return on investment (ROI) analysis, you need to evaluate the risks and costs associated with implementing cloud computing for your organization. This article describes the risks and disadvantages of moving to “the cloud,” explains how to mitigate those risks, and identifies issues relevant to your provider choice and implementation plans. In addition, there are topics that may not qualify as risks or disadvantages of cloud computing, but that may require an assessment of the impact on your organization. An evaluation of the risks and impact is an important precursor to deciding whether to make the move to “the cloud” and whom you select to provide “cloud” services. In fact, thinking about and addressing these issues should be part of the planning process for any deployment (cloud or not). Critical components of such planning should include technology, personnel, business process and company culture. Moving to “the cloud” requires not only considerable planning, but possibly unexpected financial investment.


If you are a new organization looking to use the cloud as part of your start-up, some of the risks may not apply, but deploying to the cloud may not be as simple as it seems. It is clearly different from past IT deployments. If you are like most IT pros, you are well versed in planning, building, and deploying to a local data center. Just because cloud computing does not include some of these components doesn’t mean your job is leaps and bounds easier.

Don’t assume that every bit of infrastructure and every people hour you save in cloud computing translates directly into reduced costs. Sure, you should expect to see an overall cost savings, but there are other areas where you may need to invest in order to successfully implement cloud computing.
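
As a rough illustration of weighing savings against new costs, here is a minimal sketch in Python. The function name `simple_roi` and every dollar figure are hypothetical, chosen only to show the arithmetic; a real analysis would include many more line items (training, WAN upgrades, integration work and so on).

```python
def simple_roi(annual_savings, annual_new_costs, one_time_migration_cost, years):
    """Return ROI as a fraction: (total gain - total cost) / total cost.

    All inputs are hypothetical planning figures in the same currency.
    """
    total_gain = annual_savings * years
    total_cost = annual_new_costs * years + one_time_migration_cost
    if total_cost <= 0:
        raise ValueError("total cost must be positive to compute ROI")
    return (total_gain - total_cost) / total_cost


if __name__ == "__main__":
    # Made-up example: $100k/yr saved on infrastructure, $30k/yr of new
    # spending (for instance, a bigger WAN link), $60k one-time migration.
    roi = simple_roi(100_000, 30_000, 60_000, years=3)
    print(f"3-year ROI: {roi:.0%}")
```

The point of the sketch is only that the new costs belong in the denominator; leaving them out makes any cloud move look better than it is.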

I’ve tried to break up this article into the major areas of risk and implementation issues that you should consider. Most writers break topics like this up into smaller articles but in my opinion this frustrates many readers. Rather than ask you to read about 10 different articles, I’ll publish it as a single foundational topic and then refer to this article in later postings.

While you consider the various factors below in evaluating and comparing cloud computing with a locally implemented solution, be fair in your comparison. For example, many evaluators are pretty tough on security (and they should be), but their measures or requirements should not far exceed what they currently have in their own data center. Pit your own organization’s capability against a possible cloud provider’s ability.

Business Continuity



The term “business continuity” is a fancy way of saying that the application will be available to run your business on a day to day basis, and that you have mitigated the factors that could make it unavailable. The telephone was the 19th century business critical system. The telephone has been replaced by the network and e-business computing systems that run today's companies.

System backups are one of the first things that people think of when they hear this term. Daily backups mitigate the effects of system crashes, errant programs, and hackers that destroy data you need to run your business. But backups are “old school”. In today’s business, some IT systems absolutely require higher availability than a restore from backups on failure can provide. Restoration from backups is now considered a last resort; the time it takes to restore critical systems is just too long. E-commerce and other business critical systems rely on redundant solutions that fail over automatically should any one component fail. Like a commercial airliner with redundant systems - engines, electrical, pumps, radios, pilots and more - high-availability IT systems have redundant power, cooling, web servers, database servers, networking equipment, network routes and other services so that if any one fails, another can take over. Some business critical systems may have redundant facilities in order to mitigate any possible issue with the facility itself (earthquakes, hurricanes, and twisters, to name a few). Real-time replication sends a continuous stream of data to redundant systems at fail-over sites, ready to bring you out of disaster within minutes.
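
To make the redundancy argument concrete, here is a minimal sketch in Python. It assumes that component failures are independent (real systems only approximate this) and that any single surviving copy can carry the full load. The function name `parallel_availability` and the 99% figure are illustrative assumptions, not numbers from any vendor.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of a group of redundant components where one survivor is enough.

    Assumes failures are independent, which is an idealization.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    # The group is down only when every copy is down at the same time.
    return 1.0 - (1.0 - component_availability) ** copies


if __name__ == "__main__":
    for n in (1, 2, 3):
        a = parallel_availability(0.99, n)
        print(f"{n} server(s) at 99% each -> group availability {a:.6f}")
```

Under these assumptions each added copy shrinks the expected unavailability by a factor of one hundred, which is why fail-over designs are preferred over restore-from-backup for critical systems.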

When large grocery supermarket chains became popular, health officials saw a decline in bad food and disease caused by unclean conditions and sub-optimal suppliers. One of the reasons is, in part, the impact that a problem at one store would have on the supermarket chain as a whole. Bad publicity from a single store could affect a multi-billion dollar, nation-wide supermarket chain. Therefore the supermarket chains put a lot of focus on clean stores and clean suppliers. This analogy can be applied to business continuity provided by cloud computing vendors. If one customer of a large cloud computing provider loses business continuity due to the provider’s platform, it puts the entire cloud company at risk. Because of this risk, and due to the efficiencies of scale, we are likely to see up-time numbers that are economically infeasible for most businesses to achieve on their own.
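
When comparing a provider's up-time numbers with your own, it helps to translate a percentage into hours of permitted outage. The sketch below does that conversion in Python; the helper name `allowed_downtime_hours` and the sample percentages are assumptions for illustration, and the year is taken as 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of outage permitted in a period at a given up-time percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (1.0 - uptime_percent / 100.0)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% up-time allows about {hours:.2f} hours of downtime per year")
```

Running the same function against your own data center's outage history gives a like-for-like number to set beside the vendor's.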

There is always the risk that your cloud computing vendor could go out of business, and if that vendor provides a critical service it can deeply affect your business continuity. Depending on the circumstances, this could be quick and swift action by the banks, locking doors and repossessing all of the equipment used to host your application or it could be a longer process, giving you time to switch vendors. In either case, evaluate various models that might protect your data and business continuity. At a minimum, you should evaluate the vendor you are going into business with and continually monitor the financial health and risk of that vendor for the life of your partnership.

Like any software vendor, a cloud provider may decide to stop offering a service that you use. The one difference here is that if a vendor decides to stop offering their software for sale, you can normally continue using your software, though the support and upgrades may dry up. With a cloud provided service, the service is just gone. Whom you partner with to provide your cloud services, and your contract, will drive how big this risk is.

When evaluating cloud computing options, you should find out how your prospective vendors provide business continuity under various circumstances. Below is a list of items you should ask about or look for in their informational package related to continuity.
  • How the vendor mitigates each of the risks attributed to availability

  • Service availability statistics - in-house and reports from other customers

  • Systems monitoring and health-check processes (how granular is their monitoring?)

  • Frequency of regularly scheduled tests of redundant systems and fail-over procedures.

  • Frequency of failure recovery drills to validate those processes

  • Whether the vendor conducts a root cause investigation on all critical failures

  • Client accessibility to status boards and other methods to monitor services.

  • Contract options regarding penalties for down time and other sub-optimal service levels.

  • Design features of applications that insulate the client from down-time due to various maintenance activities.

  • Vendor procedures that support restoration of user level data.
In addition, you should be comfortable with your vendor’s capacity planning process. You won’t be their only customer, so find out how they monitor and plan their capacity needs for network, storage and computing. How quickly can they increase capacity to meet spikes in demand due to new business and seasonality? Some vendors can offer guaranteed service levels by insulating your usage from others so you don’t suffer a compute brownout during your busiest time of the year.

Identify a vendor who is willing and prepared to talk about the topics above, discuss past issues and describe how they have addressed them. This indicates a level of maturity and confidence.

What happens when you can’t pay the bill?

Your data and IT capability are company assets and part of the company’s valuation. Moving them outside company walls could present a significant risk should your company meet with financial trouble. You should understand what will happen to your data if your company is unable to pay your cloud provider bills. If your company goes into bankruptcy or is sold after not paying the bills it could be disastrous. It is critical that this intellectual property is not lost, and that the new owner can access it if the company is to be resurrected after bankruptcy. Although we don’t typically think about this, as an IT professional you need to ensure that your company assets are covered in such a scenario. Speak with your cloud provider to ensure that the contract clearly states what happens to your data and capability. At a minimum the data should be saved off in a way that is usable. You may need to have an agreement similar to a cleaning deposit on a rental property, but instead of cleaning, you pay up-front to have your data archived in a way that is usable and can be restored to service if you default on your contract.

User Data

When a user leaves a company, the user’s account is normally disabled or removed completely, but that person’s user level data (documents and other info) is not normally automatically deleted, and if it is, it’s probably stored on some sort of backup media. When the data is in the cloud, this may not be the case. Find out what happens to user level data (if there is any) when you remove user accounts that are no longer needed. If valuable user level data exists, ensure that there is a process to support the assignment of that data to another user or capture that information in another way.

Security

Security is a huge area of concern for most companies that host their own services and is one of the top reasons businesses shy away from third party hosted applications and services. A lot of this apprehension is due to trust and control because, once a business moves its data outside the corporate data-center walls, it’s no longer in the hands of company employees. Hackers try to gain access directly to business systems and also attempt to compromise employee computers in order to gain access to business data.

One of the things that is a constant for most cloud computing platforms is that the services they provide are available to any Internet connected computer. In many cases this is an advantage, but to companies accustomed to the security of private networks, this is scary. Most companies host very private information systems on company networks that are available only to employees. There is a certain sense of safety and security that a private network has as the first line of defense. A company can control who has access to the company network and then who can access the private data. The trust in this first line of security may be somewhat misplaced, but may feel safer than being connected directly to the Internet. When using cloud computing, a company’s data is hosted within the same facilities and may be “co-located” on servers with data of other companies.

When evaluating the security of a cloud vendor, compare it with your own security practices and capabilities. You may find that, like the supermarket analogy above, the measures and capabilities a cloud provider has taken exceed anything feasible for your own company. Don’t take it on faith though. Do some digging. Some questions that you might want to address with the vendor are:
  • What measures does the vendor take regarding network and system security?

  • If a system is compromised, how contained is it?

  • How many resources does the vendor have in the security area, or what is the level of security funding as a percentage of total operations labor cost?

  • What outside firms have evaluated their security, what grade were they given, and how often are they evaluated?

  • What vendor personnel have access to your data and what is the process for granting and removing access to those individuals?

    • Is vendor employee access extended or only granted briefly to support maintenance activities?

    • Do vendor employees that have access to your data receive background checks?

    • How does the vendor block their employees from downloading data to portable media (like USB drives)?
  • What sort of security logging and monitoring is in place?

    • What access do you have to access logs?

    • What other logs are kept, what is in those logs and who has access to them?
  • How are authentication credentials protected (where are passwords stored, etc.)?

  • Does the vendor offer two-factor authentication and, if so, do they offer hardware devices to support this?
  • What is the policy for applying software security patches?
  • How is access controlled to the core data of the application?
  • Is network security used to isolate and control flow as well as encrypt data between certain parts of the solution system, or are back-end networks wide open?
  • Is data stored in an encrypted form? If so, what are the crypto algorithms used and where are the keys stored?
  • Is your company data stored with that of other companies or is it separated some way?
  • How does the vendor protect backup media and what do they do with old or failed storage devices like disk drives?
Security is becoming so complex that entire businesses have been created to help you evaluate the intricate details of security implementation. One thing you might look for is any standard that the cloud provider complies with, like ISO 27001. Make sure you check how that certification was obtained and what its scope is.

Legal Implications

Data Ownership

Several legal issues may come into play for those using cloud computing. Your evaluation of the vendor should cover who owns the data that is generated and stored on the vendor systems.

Privacy

Privacy for employees and customers is important as well. Ensure that the cloud provider can adhere to your privacy policies as they pertain to your employees and your customers. If you have employees or customers in other countries, you need to ensure that those policies align with any country-specific rules as well. One example of a country-specific rule is that although a company may keep access logs for security reasons, those logs can’t be made available to managers for purposes of evaluating individual worker hours.

Servicing Legal Searches

Moving data and systems outside your company walls may expose it to easier access by outside lawyers because the vendor doesn’t have as much at stake (except their reputation). Know the procedures that your vendor must follow if there is an investigation into your company. For example, will the vendor just roll over to a law inquiry, or will they turn the request over to your company lawyers and only grant access after you have authorized it?

Supporting Legal Discovery

Find out what capabilities the cloud service has that allow you to put data on some sort of legal hold so that it’s not deleted or changed during pending legal actions (whatever they may be). You may not have direct access to the database where data is kept, but there should be a way for you to extract data in a usable form and/or quarantine the data.

SOX Compliance

There may also be Sarbanes-Oxley (SOX) compliance issues that need to be considered during your implementation. How they might apply depends on your implementation.

Location

Know the physical location of your data. One of the things that IT pros need to know when moving their company into the cloud is where the data physically resides. Physical location might impact the legal and regulatory aspects of moving to the cloud. More specifically, what country does the server reside in? Although I don't have all the answers regarding the specific legal and regulatory aspects that having your data in various countries may have, it's reasonable to postulate that there is most certainly risk here. For example, the US has very specific laws regarding search and seizure of computers and the information they hold, but these laws no longer apply when your data resides outside the US. I won't name countries here but many governments wouldn't hesitate to march into a data center, confiscate equipment and ask questions later. The point should be clear: understand where your data is kept and, if not in the US, understand the legal as well as business continuity risks. If you are a multi-national company it becomes more complex.

UPDATE (2/3/11): A separate article, titled "Do Govt. Laws Present Undue Risk To Cloud Computing?" was published today that deals with the implications of cloud computing and legal searches.

Disclaimer: The author is not a lawyer, nor is he attempting to give you legal advice. :)

Interoperability

Your need to integrate with other IT systems to support your business doesn’t go away because you are using cloud computing. In fact, it may be more difficult to integrate with other systems when you use cloud computing. Sales systems need to integrate with accounting systems, fulfillment systems need to integrate to supply-chain systems, etc.

Before you move a portion of your IT system to a cloud provider, you will need to design and plan how you are going to integrate with other systems. Find out what your potential cloud vendors can support in the way of integration with your current systems. It would also be prudent to find out how a cloud provider will support you if you have a sudden future need to share data or integrate.

Upgrades

Normally when people hear the word “upgrade”, it’s a good thing, but what if you don’t like the capabilities the cloud provider is adding or changing? Sometimes you don’t have a choice and a provider may remove features that you need. One of the ways cloud providers lower costs is by deploying the same solution to all their customers. If you have a locally hosted solution, you can skip the upgrades if you don’t want them, and in some cases, you can modify new releases to fit your needs. You may need more time before an upgrade so that any integrated applications can be upgraded to maintain compatibility. Look into how the vendor releases upgrades. Some providers allow you to turn off new features or change the way they behave.

Modifications

Sometimes going to the cloud hampers your ability to customize an application to your needs. Many companies are able to modify the software applications they buy, or they build special functionality needed by the business onto those applications. It also means companies might have the ability to fix bugs while waiting for the next release from the vendor. There is a real cost to modifying off-the-shelf applications and/or building and supporting add-ons, and a company would be remiss to do so without the expectation of a real ROI. Look at the capabilities the provider offers in this area and then do an ROI analysis to see what is right for you. Some cloud providers will have the flexibility to adapt their service to your needs. For example, Google Apps allows administrators to turn off certain features in its office productivity suite, and Google Analytics has APIs so that you can extend its capability.

Vendor Lock-in

Although vendor lock-in exists to some degree with any software package (try switching from SAP to Oracle accounting), the cloud could make this worse. Vendor lock-in is really manifested in the fact that it’s difficult to change from one packaged solution to another. If the package is installed on your computers where you have complete and unfettered access to the data, migrating to a different solution is almost always doable. Depending on the cloud provider, you may find it more difficult to get the level of access you need. As cloud computing grows up, there is a growing demand on vendors to be more open and make it easier for customers to migrate to other solutions. This is a win-win situation because offering this capability lowers barriers for customers to move into the cloud (potentially speeding up adoption) by lowering customer risk. Evaluate the openness and ability to access your data as it pertains to vendor lock-in when choosing a provider.

Archive Data

When you build and deploy an application in-house, you decide how much data is kept for later analysis or other uses, with no time limits on duration. This may not be the case with a cloud provider, which may limit retention time or the space used. You also might find that, due to the cloud provider’s scale, data could be held indefinitely. This sounds good, but in some cases companies prefer to remove data they no longer need, for legal or other reasons. In addition, you may not be able to restore archived data to a system that understands how to use it, due to version differences in the application that created it. You will need to ensure that the archive and retention policies of your cloud service align with your own policies, and you may want archives that are usable as a complete record.

WAN Dependence

By adopting cloud computing you now have a much greater dependence on your wide area network (WAN) connection to the Internet. Although many companies consider their Internet connection a business critical part of their infrastructure, some may not. By moving to cloud computing, WAN connectivity becomes a very critical piece. When the WAN link dies, it’s the same as having a total system failure in a local data center. Cloud computing implementers may need to view their WAN link in a whole new way. In some companies it may have gone from nice to have to absolutely critical. To mitigate this risk as best you can, evaluate your connectivity partners. You now need to understand every link in the chain all the way to your cloud provider. You need to revisit your Internet provider’s connectivity and take a look at the entire end-to-end service. Here are a few things that you should look at.
  • Revisit your network provider service level agreement.
  • Reevaluate your WAN capacity needs. Now that you are moving your computing off site, expect your WAN traffic to grow. Plan for this.

  • Understand what level of redundancy your provider has within their infrastructure (like power, cooling, network routers, etc.).

  • Does your provider have redundant connections to the Internet?

  • What on-ramp does your provider use - a direct connection to one of the major back-bones or another route?

  • Each router your data travels through is a potential failure point.

    • Evaluate the route that data will take from your worker’s PC to the final destination.

    • Find out how many “hops” your data has to travel through in order to reach your cloud vendor’s service.

    • Include your own networks in this analysis.

  • Evaluate the need to use a separate Internet service provider as a backup in case your primary network fails, and consider load balancing the traffic to your provider.
Understand how moving to the cloud affects access by your remote workers. Many companies have employees and key partners located world-wide. This is accelerated by a teleworker workforce. You should ensure that your teleworkers and partners are not adversely affected by the move of your application to cloud computing.
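
Returning to the checklist point that each router is a potential failure point, the effect of a long chain of hops, and of adding a backup ISP, can be sketched in Python. This is a simplified model that assumes every link fails independently; the function names and the per-hop availability figures are hypothetical and would need to be replaced with numbers from your own providers.

```python
from math import prod


def chain_availability(link_availabilities):
    """Availability of a path where every link must be up (links in series)."""
    links = list(link_availabilities)
    if not links:
        raise ValueError("need at least one link")
    if any(not 0.0 <= a <= 1.0 for a in links):
        raise ValueError("each availability must be between 0 and 1")
    # The path works only if all links work, so the availabilities multiply.
    return prod(links)


def with_backup_path(primary: float, backup: float) -> float:
    """Availability when either of two independent paths is sufficient."""
    return 1.0 - (1.0 - primary) * (1.0 - backup)


if __name__ == "__main__":
    # Hypothetical: ten hops between a worker's PC and the cloud service,
    # each available 99.9% of the time.
    one_path = chain_availability([0.999] * 10)
    print(f"single path through 10 hops: {one_path:.4f}")
    # Hypothetical: a second, fully separate ISP path of the same quality.
    print(f"with a backup ISP path:      {with_backup_path(one_path, one_path):.6f}")
```

Even with good individual hops, the end-to-end figure is lower than any single link, which is the argument for counting hops and for considering a second provider.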

People

Staffing

The first thing that people think is that all IT staffing goes away, but that is not the case. You still need people in IT to integrate systems, manage your vendors, etc. So how do you staff for cloud computing? One of the obvious things to look at is what’s changing and where the dependencies of your adjusted or new architecture lie. An obvious dependence is the wide area network (WAN) connection to the cloud provider, so evaluate whether you have the right number of network engineers.

IT Staff Unrest

We talk a lot about technology and some really geeky stuff, but people are part of the equation at every intersection in a well run IT organization. Juan Carlos Perez’s article in the New York Times about IT worker unrest is right on the mark. When moving to the cloud, IT workers feel like their jobs may be at risk. Make sure you keep the lines of communication open and honest.

Training Plans

Look at what’s changing as well as your future direction, then think about the skills you need and the training that will unlock those capabilities to keep your staff current and sharp. If you are using a platform as a service, like Google’s App Engine, your staff will likely need time for training and a plan to start small and build on your successes and failures as you grow your internal intellectual capital.

End Users

Users are people too. Going to the cloud can bring a lot of change for your end users. Look at how your users' application experience will change as well as how they will request and receive services. Step through the entire life cycle and business process with the end user in mind. Once you understand the impact you need to communicate broadly. Listen to your end users' concerns and look for opportunities to enhance their experience around things like provisioning, use, and support. Involve users early in the process. You may be releasing exciting new capabilities to your users. To truly reap the benefits you need to prepare and educate the workforce.

Don’t forget to look at how you support your end users. It may be a little confusing for your users. Successful cloud vendors seem to have good user support products that include help pages, forums, and knowledge bases. If these don’t solve your issue, you are now relegated to the support process defined by the vendor.

Business Culture Implications

Be aware that business culture in your company can be a resistive force to benefiting from some of the new capabilities that cloud computing brings. One example is collaboration. You may expect huge benefits from open and friction free collaboration between workers or even partners but you should look at how receptive your business might be. This is more significant than you might think.

As an example, when introducing open collaboration in wikis within my company, I found it very difficult to get people to collaborate. This is because it ran counter to the existing culture and work habits. Workers, until this point, were expected to provide feedback to the author via e-mail after reviewing content. Because there was a sense of content ownership, others were reluctant to make changes to what the author had written even after they were told to “change at will”. It took a lot of time to convince people that it was OK to modify content authored by others. I also found that many users were not comfortable with publishing content widely. The culture in many areas of the business was to work in contained silos. One of the larger benefits of wikis in my company took much longer to realize than expected. Don’t assume that your company will just jump on the bandwagon of new features, capabilities, or ways of doing business.

Development Needs

Just because you moved your business application to the cloud doesn’t necessarily mean that your development days are over. You may still have the need to integrate the capabilities of your cloud hosted application with other business systems. Cloud computing could be a move to “platform as a service” where you reap the benefits from easily scalable infrastructure similar to those from Amazon and Google.

Business Process

Understand what business processes change as a result of your shift to cloud computing. There isn’t always a one-to-one match in capabilities offered by cloud computing vendors to your local application. Understand your business process and work with the process owners and leaders to ensure your cloud computing implementation will be a success today and fit your future strategic direction.

The Cloud Is An Infant

Cloud computing is in its infancy. Providers are still trying to figure out how to properly build, scale, and market their services. Expect a lot of change to come as cloud providers try to find the best way to deliver their services. This instability equates to risk. The IT industry is constantly going through change but this is a bit more significant though I can’t quantify it any better than that. This statement begins to show the article’s age but as of 10/25/2011, we are still trying to figure out things like security, design for redundancy, etc.

Have an "Exit Plan"

Although this is sort of IT 101, build an idea of what it looks like to shut down an application that you either built or are using as a service. This should include things like migration of your data, and how you scrub important business data from the vendor's servers. This article by Paul Venezia of InfoWorld explains the implications of exiting and how one IT group ensures their data has been removed from a provider's servers. I don't think the author goes far enough because there are most likely backups of your data on remote servers and possibly tape. You should already know what your provider's retention policy is and in what form these backups are kept. If you have stand-by servers in another data center, you may want to perform a wipe on those disks as well before the storage space is released to other customers. I'll assume that the backups on tape will eventually be destroyed and, if you've actually implemented with this provider, that you are OK with their security policies that control access to backup media and their disposal procedures.

Closing

There are a lot of benefits to cloud computing and the concept of software as a service, but it’s not a slam dunk. To make the right decision for your company you need to fill out both sides of the balance sheet to know if you can really achieve the potential ROI of cloud computing. Hopefully this article helps you take precautionary measures that may reduce your risks as cloud computing grows up.

See Also

Some of my previous articles on cloud computing.


- Chris Claborne
