
Blog Entries in downtime

Sunday, October 3rd, 2010 - 4:14 pm EDT

The Dangers and Risks of the Norms of Availability

Posted by: Rob Ciampa

OK. We’re not being coy, even though we have a big “Major Product Announcement” box on the front page of our web site. Really. Am I going to share? Yes, but not now. Instead, I’m going to provide a bit of a drum roll and some recent feedback from many worldwide discussions with some great analysts and thought leaders.

To start off, we’re not just making a product announcement. Rather, we’re proposing an entirely new way to think about availability: one that will actually work – and work for the masses. We’re not pulling any punches. It builds upon years of experience, 14,000+ implementations, and thousands of customers. Combine that with some stunning technological and price-performance breakthroughs and the game begins to change. It’s a direct assault on what I’ll call the “norms of availability.”

These norms have forced many organizations either into accepting a false sense of security or tolerating downtime that could have been prevented. We’ll be attacking both this week. So what are some of these norms of availability?

  • Recovery is OK because it’s getting faster. Really? Go ask American Eagle Outfitters about their 8-day outage debacle. Given that over 30% of recoveries don’t go as planned, this is dangerous, especially now because the stakes are so high. “Recovery” is still application downtime and still presents a high risk of data loss. Remember: prevention is better than recovery.
  • Virtualization is availability. Actually, virtualization is consolidation. Virtual machines (VMs) fail and have to be restarted. Virtualization, though very important, is a subset of availability. Want to gauge availability risk with pure-play virtualization? Look at the other pieces such as storage. No big deal? Most organizations will disagree on many fronts, including price and complexity. Virtualization matters (and we’re fans), but availability should be thought of first.
  • DR keeps us going. DR is the nuclear option. It’s a last resort when major catastrophes occur. It should NOT be used when a disk drive fails. (More on this in a subsequent post.) DR is a necessity, but needs to be combined with local protection to make a very powerful availability combination.
  • We have high availability support for our applications. That used to be good enough, though it's really an inherently flawed approach. Remember, it’s still a restart; it’s still complex; and - do the math - it’s still expensive.
  • Cloud computing will solve all our issues. Actually, it may ultimately be a great part of the DR component of broader availability, but it’s just not going to work for localized failures. Time to think holistically.
  • Backups protect us. They sure do, but they may be a day or two late. Can your business afford that? Keep your backups going, but consider other things to amplify availability.
  • I’m not worried because I never had a problem. Wow. Now I’m really scared. You may want to give us a call before you get fired. During recent services work, we’ve helped many of our customers not just identify, but find their critical assets. Do you really know where your important assets are? Do you want to be looking for them when something goes awry?

I’d like to say these are tales of fiction, but they’re not. They’re part of that dangerous norm and we hear this regularly. Fortunately, many organizations are getting better. Shortly, we’ll provide them with availability capabilities that they’ve never had access to, either because of economics, complexity, or scalability.

For the past few months, we’ve been briefing analysts and other experts on what we're delivering this week (and after). What has their response been?

  • Wow. This is BIG.
  • You’re changing the dynamics of availability.
  • You’ve eliminated the triage IT decision makers face while tackling availability.
  • You’ve emancipated applications from downtime.
  • This is the “American Dream” of computing.

I like the last one, though I think the rest of the world will appreciate it as well. Stay tuned. We’ll share the word in the next couple of days. And it won’t end there because fault tolerance is about to go mainstream and the implications are substantial. And that’s just the start…

Rob Ciampa

Availability  Announcements  Continuous Availability  Disaster Recovery  Downtime  Fault Tolerance  High Availability  Virtual Machine  Virtualization 




Tuesday, August 24th, 2010 - 5:27 pm EDT

Effective Risk Assessment: Q&A

Posted by: Michelle Liro

We had a very lively presentation and Q&A during last week’s webinar “How to Cut Risks and Costs with a Downtime Analysis and Action Plan.” A summary of the Q&A is below.

Q: Should branch offices be included in a downtime assessment?
Absolutely – you can’t ignore branch offices. Forrester estimates that 20% of your business comes from branch offices. IT needs to make sure to include those in your assessment plans and budget.

Q: How often should I conduct a business and risk impact assessment?
We’ve found with our customers that an annual assessment is usually sufficient, unless you have some significant kind of change like an acquisition or new location. In that situation you obviously need to do a refresh. You can then use that info moving forward as you conduct your annual assessment.

Q: Is there any available information about rough cost estimates of down time impact in control systems like DCS or SCADA and Historians like the one you showed for IT systems in one of your slides?
We work with a number of ISVs in the process control space including GE, Johnson Controls, Rockwell and many others. We conducted an assessment in a pharmaceutical plant where one minute of downtime led to the discard of an entire batch, which resulted in a loss of $950,000 to $1.1 million. In process automation and process control, downtime also affects efficiency. One company doing waste water treatment couldn’t handle its processing levels because of the downtime it was having, and was considering opening a second facility. The assessment revealed that they could actually just retool their existing applications to increase their efficiency and not have to open a second facility. There’s a huge safety element here as well. When some types of systems go down, it can cause significant safety hazards to employees and others. This should also factor into your downtime risk assessment.

Q: What about hosted applications? How can I incorporate those into my assessment?
Very often, some of your most critical applications are no longer hosted at your site. They’re still obviously extremely important to the business and need to be included as part of your assessment. Treat them exactly the same as your on-site applications, but make sure that the vendor has the protections in place to keep your applications at the availability levels you need.

Q: With the increased reliance on the Internet, how do you factor the loss of the Internet (i.e. nationwide cyber attack) in risk/mitigation planning?
What we covered in the presentation is mostly what’s under your control, but you also need to factor in security needs and look at the areas outside your control. For example, what would happen to your business if your internet connection went down? Should you have a secondary carrier? Are you going to go from a T1 connection to some other kind of connection?

Q: Are Marathon’s assessment services delivered primarily as a way to introduce Marathon software into the account, or do you sometimes recommend other software solutions that may be a better fit?
It depends on what you need. Sometimes we’ll go into an organization and do an assessment and they’ll have applications that aren’t necessarily mission critical and they can deal with several hours or days of downtime. What they already have in place might be acceptable for that situation. Or they may be in a situation where they just need disaster recovery. For the instances where there are mission critical applications involved and they can’t tolerate downtime or data loss that’s where we come in.

Q: Would you ever recommend the use of cloud-based VMs for disaster recovery?
It depends on your needs. When you look at the spectrum of availability, there are just so many buzz words and acronyms out there. Fault tolerance, high availability, disaster recovery, business continuity, replication, and on and on. There are efficiencies with cloud-based DR, but the reality is that a lot of these services use a “recovery” model, which means there is downtime involved. These types of services don’t keep your business going during an outage; they just help you recover after the fact. At Marathon, our focus is on the prevention of downtime and the continuation of business.

Q: Is there a tactic (rule of thumb) you'd recommend to avoid departments classifying everything as mission critical, as everyone believes their app is mission critical?
Every department likes to think that their particular applications are critical to the business. This is why companies like to engage third parties to help them with this process. Companies like Marathon can come in with an objective perspective, ask detailed questions, and provide guidance without any of the internal politics getting in the way.
 

Webinar  Downtime 




Thursday, August 5th, 2010 - 10:18 am EDT

Why Settle for Less?

Posted by: Rob Ciampa

Restlessness and discontent are the first necessities of progress.
-- Thomas A. Edison


“So I hear you’re going to Marathon,” a good friend said to me over the phone just before I started here recently. “I understand they’ve got some really cool technology to prevent downtime. I could probably use it, but I’ve just come to accept regular downtime as a fact of life, especially with applications such as email.”


“What do you do when you're down?” I inquired.


“Not much,” he responded. “As a manager, I rely on email for much of my project-based activities. But the outages seem to last much longer than one would expect. The last one was several hours.”


“That,” I said, “is precisely why I found the Marathon opportunity so interesting. It seems as if everyone is settling for less and that downtime is OK. Where’s the discontent? The angst? Finally, we’re starting to see the frustration level rise. Our world has fundamentally changed and our reliance on technology is a critical element of many things in our lives. When one of the critical apps on my mobile phone isn’t working, I’m down too. Market expectations and reality are misaligned, which creates a great opportunity for Marathon, its customers, and its prospects. That’s incredibly exciting. Plus they have thousands of customers. Those are quite a few proof points.”
 

Over the years, I have been an engineer, an entrepreneur, a systems integrator, a consultant, a channel manager, and a marketer. I built radars, operating systems, computing platforms, embedded systems, data centers, corporate networks, carrier networks, and worldwide internet service providers. In every single project – large or small – my teams and I always had the fundamental challenge: how do we keep the systems running? How do we keep them available? In some instances, lives depended upon uptime.
 

What did we do to address this? We just focused on recovery in the event of failure(s) because we never had a prevention mindset. That’s the Marathon differentiation that I found so appealing. It’s a powerful model. When I heard about customers such as Abercrombie & Fitch running for years without an outage or data loss, I quickly realized that prevention works. Why settle for less?

 

Availability  Downtime 




Monday, August 2nd, 2010 - 11:36 am EDT

Top 5 Low-Cost Tips for Preventing Exchange Downtime

Posted by: Michelle Liro

Thanks again to everyone who joined us for last week’s webinar “Top 5 Low Cost Tips for Preventing Exchange Downtime” where Marathon’s availability experts reviewed their key tips for the prevention of downtime, including:


1. Reduce human error with process
2. Document your infrastructure
3. Remove single points of failure
4. Don’t forget to test
5. Understand your requirements


There’s a lot of great information in this 40-minute webinar, so be sure to check it out. We’ve summarized the Q&A portion of the webinar below.
 

Q: What type of storage does everRun support?
everRun supports any type of storage that you have. The most common storage configuration we see is local disk drives for the servers themselves. That would have the same amount of data protection as even a SAN would, and in some cases would be even better protection, because you have total redundancy from both servers and everRun is protecting that as if it’s a single storage device. You could also have iSCSI connected storage, or any kind of SAN storage that you wanted to have. Again, everRun supports any type of storage.

Q: Which versions of Exchange do you support?
The beauty of the everRun architecture is that it can support pretty much any Windows-based application. Exchange 2003, Exchange 2007 or Exchange 2010 – everRun supports them all. Some other high availability solutions require specialized scripting to support applications, but everRun does not require this. Also, with solutions like clusters, sometimes you have to buy the higher-end, more expensive “enterprise” versions of the application software to support that configuration, but with everRun, we can provide complete protection for the standard versions of Windows and Exchange Server or any other application.

Q: What is the load on the systems when using everRun?
The good news here is that there is very little overhead associated with everRun – about 5% to make things run redundantly. That’s a very small performance price to pay to get such a high level of protection for Exchange.

Q: How does everRun handle the mirroring of data that’s loaded in memory?
There are a couple of ways that is done. Since the application is actually running on both servers simultaneously, the memory is being replicated on both servers simultaneously as well. Keep in mind that as the applications execute, they are writing to storage, and because of the redundancy built into the everRun solution, that data is being written from memory down to the storage element redundantly as well.

Q: Is it possible to run servers in two different locations?
Absolutely. In the slide where I showed the everRun architecture with the two servers, you can take those two servers and separate them geographically. They could be in different rooms in the same building, different buildings on the same campus, or even separated further, by about 100 miles, depending on the bandwidth and latency of your connection. We call this our SplitSite configuration.

Q: How is this different from a cluster solution?
The major difference of everRun vs. a cluster solution is that we are doing operations on two servers simultaneously. The application is actually running in tandem on both of these servers. With a cluster solution, you’re running your application on one server, while the other server stands by and waits for a failure to occur. That means that with a cluster solution, when the first server fails, the cluster then has to do something to start up the application on the second server and then continue from that point. But that means downtime, data loss, and loss of connectivity. With everRun, that doesn’t happen. Because the other server is already doing the same thing, there is no downtime and no data loss, because there is no “recovery” – even when there is a failure.

Q: So are both servers “hot” in an everRun configuration?
Yes – that’s exactly right. Both servers are active and run simultaneously, unlike a cluster. So with everRun, you could have a failure of a component on one server and then another type of failure on the second server and still be operational. With a cluster, this scenario is not possible. If you have failures on both systems at the same time with a cluster, then you are down.

Q: Does everRun require dedicated servers just for Exchange?
No – everRun protected servers do not need to be dedicated to one specific application. You can run multiple applications on this pair of servers, and even choose which ones you do or don’t want to protect with everRun. This is good for small businesses that want or need to consolidate several applications onto fewer servers.

For more information about protecting Exchange from downtime, be sure to check out our white paper "Six Secrets to 24x7 Exchange Availability."
 

Webinar  Downtime  Exchange  Fault Tolerance  Interview  Webcast  Windows 




Thursday, June 17th, 2010 - 1:37 pm EDT

How to Cut Risks and Costs with a Downtime Analysis & Action Plan

Posted by: Michelle Liro

Earlier this week, we hosted a webinar on the topic of “How to Cut Risks and Costs with a Downtime Analysis & Action Plan.” We know from our experience in application availability that many companies avoid these types of assessments – they either don't know where to start or decide that they don’t have the time or experience to conduct an assessment, so they just live with the unknowns and hope that nothing bad happens. (We’ve seen the consequences of downtime at many companies and don’t recommend this method!)

Our VP of Services & Support, Beth Shea, explored this topic in detail and provided a simple framework that companies can use today to uncover their risks and put measures in place to minimize the impact of downtime. To learn more, be sure to watch the 30-minute webinar. You can also check out the Q&A session from the webinar, summarized below.

Q: When looking at the impact of downtime, is it just unplanned downtime, or should you include planned downtime as well?
You absolutely need to plan for both planned and unplanned downtime, as there’s a real cost and business impact to both. They both need to be included in your impact assessment.

Q: What about branch offices – should they be included in a downtime assessment?
According to Forrester Research, about 20% of a company’s business is tied up in branch and remote offices, and IT needs to include these offices in any assessment that they are conducting. You shouldn’t overlook these offices when putting together your downtime and business impact assessments. They have to be factored in.

Q: How often should I conduct a business impact and risk assessment?
What we’ve found with our customers is that conducting an annual assessment is sufficient, or in some cases, twice a year, depending on the type of business. You can then use these as your benchmark going forward to determine the success of the initiative and ensure that you have the key metrics to report to your management team.

Q: How do you determine when to use local high availability vs. a disaster recovery solution?
Fault tolerance, high availability, disaster recovery - all of these different terms can be confusing and they can have different meanings to different people. The way we think of it, implementing high availability or fault tolerance ensures that locally you are protected against the everyday, nuisance failures that cause downtime. If you lose a fan or a drive, for example, you would automatically route to another server within the same building or local area. Disaster recovery solutions are really for recovery from catastrophes (fire, flood) or other events where you need to fail over to a much more distant location. You don’t want to use this type of solution for everyday failures, as it can be very time consuming to fail over and fail back, and you can potentially lose some data. For local protection, you want high availability/fault tolerant solutions.

Q: What about hosted applications like salesforce.com, how do I account for those in this type of assessment?
In today’s world, so many applications are offered as Software-as-a-Service (SaaS), sometimes called hosted applications, where they are no longer hosted at your site. However, they are still important to your business and need to be included as part of your overall assessment. Our approach is to conduct the assessment for your SaaS applications as if they were all onsite. Then use your tiered analysis and make sure that your SaaS vendor is meeting your availability requirements for that application, and that they have the necessary protections in place to protect that application to the same level that you would if it were in-house.

Q: Does Marathon offer any services to conduct this type of assessment?
Yes – this is a service that we provide for our customers. Most customers are very satisfied with the service, because it usually has an immediate ROI for their business. If you are interested in this type of service, please feel free to call us at 978-489-1100.

Q: Does Marathon have any templates available to build a framework for this type of assessment?
Absolutely. From our 16+ years of working with customers on the assessment and prevention of downtime, we’ve put together an extensive list of questions to ask about the business risks and impact of downtime. Please feel free to contact us if you would like more information.

Q: How do you measure or put a price on the intangible impacts of downtime?
This can be tough to nail down, but what we recommend is developing some basic estimates. This isn’t meant to be an exact number. What we are really trying to achieve here is to prioritize applications, put them into the tiers that we discussed, and make sure that you are putting the right amount of resources against the right applications. From a productivity perspective, one metric you could use is employee salaries: how much it would cost in salary to have employees unable to work for a certain amount of time. This is just one example.
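As a rough illustration of the salary-based estimate described above, here is a minimal Python sketch. The function name, the 2,080 working hours per year, and the sample applications and figures are all assumptions made up for the example, not numbers from the webinar. The goal is only to rank applications against each other, not to produce an exact cost.

```python
def productivity_cost(employees, avg_annual_salary, outage_hours,
                      work_hours_per_year=2080, affected_fraction=1.0):
    """Estimate salary paid to employees who cannot work during an outage.

    work_hours_per_year (2080 = 52 weeks x 40 hours) and affected_fraction
    are illustrative assumptions; adjust them for your own business.
    """
    hourly_rate = avg_annual_salary / work_hours_per_year
    return employees * affected_fraction * hourly_rate * outage_hours


# Hypothetical applications: (name, employees who depend on it, avg salary)
apps = [
    ("email", 200, 62400),
    ("order entry", 50, 62400),
    ("intranet wiki", 20, 62400),
]

# Rank applications by the estimated cost of a 4-hour outage, highest first.
ranked = sorted(
    ((name, productivity_cost(staff, salary, outage_hours=4))
     for name, staff, salary in apps),
    key=lambda item: item[1],
    reverse=True,
)

for name, cost in ranked:
    print(f"{name}: ${cost:,.0f}")
```

The ordering matters more than the dollar amounts: it gives you a first pass at which applications belong in the top tier. Lost revenue, penalties, and other intangibles would need to be layered on top of this productivity figure.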

Q: Does everRun handle quick switch over to back up site if the main site goes down?
Yes, within seconds.

Q: What are the requirements for the backup site?
The machines at the backup site are in the same pool as the primary site, so the backup machines must meet the requirements to be in the same pool as the primary site machines.

Q: How about regular data sync between main site and backup site?
Since the primary and backup site are running in lockstep mode, the application and the data are always in sync between the primary and backup sites.

Downtime  Availability  Disaster Recovery  Fault Tolerance  High Availability  Interview  Webinar 




Tuesday, March 9th, 2010 - 10:52 am EST

Q&A with Craig Resnick of ARC Advisory Group

Posted by: Michelle Liro

Next week Craig Resnick, research director and automation expert at ARC Advisory Group will be the guest speaker for our webinar "Best Practices for Preventing Downtime in Automation Systems."  We recently sat down with Craig to discuss some of the recent trends in the manufacturing and automation industries.

Q: What are some of the newer trends that you are seeing in the automation space?

Craig Resnick: A primary trend that we see at ARC is the convergence of automation and IT systems. Nearly every manufacturing company uses a variety of plant automation and enterprise IT systems to manage its operations. Plant floor systems, such as distributed control systems (DCS), programmable automation and logic controllers (PACs/PLCs), and a wide range of plant floor applications provide a wealth of real-time information regarding productivity, efficiency, equipment health, capability, and quality. Business systems, in turn, provide information on raw material costs, product orders and inventories, manufacturing resources, production schedules, etc. This wide range of information often remains isolated in systems such as manufacturing execution systems (MES), laboratory systems, maintenance systems, scheduling systems, enterprise resource planning (ERP) systems, supply chain management (SCM) systems, and customer relationship management (CRM) systems. Decisions based on data from any one of these systems will always be less than optimal because, without the corresponding information from the other systems, the information will be incomplete.

To close this gap between automation and IT systems, and to address the trend of the plant floor becoming more IT-centric, ARC has defined a new space called Collaborative Production Systems. These new systems consist of platforms in which the controls layer domains of process, logic, motion, building automation, and power control systems converge with the information layer domains of production management and MES systems. These converged systems enable, for example, the required data and information to be directly tied into applications such as corporate reporting and manufacturing compliance. Collaborative Production Systems will become the industrial blade server that provides full monitoring and control of the enterprise, from the office to the plant floor, sharing that information with the supply chain to, for example, procure materials and resources and purchase or sell power at the optimal times and prices from the smart grid, while providing full financial metrics and KPIs to ERP systems to maximize profitability.


Q: Now that corporate reporting and systems are heavily tied into the “factory floor”, how is that changing the need for system availability and data protection?

Craig Resnick: The need for system availability and data protection continues to expand, driven by a combination of issues ranging from global competition to regulatory requirements. Process safety and critical control are primarily focused on system availability and process uptime. As a specific example, take the Pharmaceutical industry, where data and batch information can never be lost or interrupted. System availability and data protection needs are also forcing E-records regulations to evolve across the globe. In the US, this includes 21 CFR Part 11, as well as the FDA’s Good Manufacturing Practice (GMP) and Process Analytical Technology (PAT) initiatives. In Europe, this includes Annex 11 of the EU GMPs, electronic Signatures Directive 1999/93/EC, and Data Protection Directive 95/46/EC. The European Data Protection Directive requires even more protection on data than the current FDA regulations and extends this requirement to clinical trials patients, as all clinical trials data requires maximum protection to remain compliant with regulations.

Unscheduled downtime is expensive. It often impacts production’s ability to meet its schedule and may cause missed customer commitments. Unplanned downtime, which also includes unexpected stoppages resulting from equipment failure, operator error, or nuisance trips, is the nemesis of all manufacturers. Statistics on the impact of unplanned downtime on plant operations show that it accounts for 2 to 5 percent of production lost in, for example, the petrochemical industry. Unscheduled downtime is also costly in terms of equipment damage, environmental harm, and worker safety. The cost of downtime is reflected in a primary key performance indicator (KPI) used by manufacturers known as Dynamic Overall Equipment Effectiveness (OEE), which helps determine the real-time impact of the performance of any individual process or piece of equipment on the overall efficiency of the plant. Unscheduled downtime is a primary factor that significantly lowers Dynamic OEE, which translates to the manufacturer decreasing both its efficiency and profitability.
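To make the link between downtime and OEE concrete, here is a small Python sketch. It uses the conventional OEE definition (availability × performance × quality). Whether ARC's "Dynamic OEE" is computed the same way is an assumption, and the shift length and ratios below are invented for illustration.

```python
def oee(planned_minutes, downtime_minutes, performance, quality):
    """Conventional OEE = availability x performance x quality.

    availability is derived from planned production time and downtime;
    performance and quality are passed in as ratios between 0 and 1.
    """
    availability = (planned_minutes - downtime_minutes) / planned_minutes
    return availability * performance * quality


# Hypothetical 8-hour shift (480 minutes), performance 95%, quality 98%.
no_downtime = oee(480, 0, performance=0.95, quality=0.98)
with_downtime = oee(480, 48, performance=0.95, quality=0.98)

print(f"OEE with no downtime:    {no_downtime:.1%}")
print(f"OEE with 48 min of downtime: {with_downtime:.1%}")
```

Because availability multiplies the other two factors, 48 minutes of unscheduled downtime in this example lowers OEE by ten percent of its value, even though performance and quality are unchanged.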

Q: What are some of the basic steps that companies can implement to ensure the availability of their systems?

Craig Resnick: The first step that companies can implement to ensure the availability of their systems is to maximize their operators’ effectiveness in the control room, which is essential to minimize the risks of accidents, eliminate unscheduled downtime, and maximize production quality. The global process industry loses $20 billion, or five percent of annual production, due to unscheduled downtime and poor quality. ARC estimates that almost 80 percent of these losses are preventable and 40 percent of those preventable losses are primarily the result of human or operator error. Maximizing operator effectiveness requires automating as many functions as technology will allow, as well as reducing complexity wherever possible. For example, there are still many plants where operators monitor the processes and collect data manually or semi-automatically using chart recorders. This process is both tedious and error prone, and does not provide appropriate process insight or instill a sense of ownership among the control room operators.
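The percentages above are nested, so it can help to work the arithmetic through. The short Python sketch below does that. Since the estimate is "almost 80 percent," the sketch treats 80 percent as an upper bound, and the function and variable names are illustrative only.

```python
def loss_breakdown(total_loss, preventable_share, human_error_share):
    """Split a total loss into preventable and human-error portions.

    human_error_share applies to the preventable losses, not the total.
    """
    preventable = total_loss * preventable_share
    human_error = preventable * human_error_share
    return {
        "preventable": preventable,
        "human_error": human_error,
        "human_error_share_of_total": human_error / total_loss,
    }


# Figures quoted in the interview: $20 billion lost, almost 80 percent
# preventable, 40 percent of the preventable losses due to human error.
result = loss_breakdown(20e9, preventable_share=0.80, human_error_share=0.40)

print(f"Preventable losses:  up to ${result['preventable'] / 1e9:.1f} billion")
print(f"Due to human error:  up to ${result['human_error'] / 1e9:.1f} billion")
print(f"Share of all losses: about {result['human_error_share_of_total']:.0%}")
```

On those figures, preventable losses come to as much as $16 billion, of which up to $6.4 billion (roughly a third of all losses) traces back to human or operator error.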

The Abnormal Situation Management Consortium (ASM) points out that most incidents occur from multiple modes of failure. Preventable human error is a contributing factor to these losses, but is hardly the only cause. Preventing abnormal situations requires a multilayered, multi-discipline approach focused on maximizing production throughput, efficiency and quality while minimizing lost production time and preventing damage to assets and endangerment to personnel. This approach requires deploying collaborative production systems designed and implemented to deliver the high levels of availability and fault tolerance expected from any other mission critical industrial system. This typically requires effective data backup mechanisms, redundant controllers for critical applications, plus industrial grade software. Manufacturers are also deploying more fault tolerant server technology to ensure continuous availability of these mission critical applications; the continuous flow of vital products to the market; and the avoidance of the potentially negative financial, social, or environmental impact that operating without high availability fault-tolerant systems might bring.

 

To learn more about preventing downtime in your automation applications, be sure to attend next week's webinar where Craig will provide expert info on steps for reducing the human error that leads to downtime, how to protect your hardware, storage and networks for complete availability coverage, and how to protect against a complete site failure. You can register here.
 

Manufacturing  Downtime  Fault Tolerance  High Availability  Interview  Webcast  Webinar 




Wednesday, October 14th, 2009 - 11:19 am EDT

4 Simple Steps to Reducing Downtime

Posted by: Michelle Liro

We had a fantastic presentation last week from IT expert and author Niel Nickolaisen. Niel shared his proven methods for reducing downtime and improving the alignment of IT resources to better support business goals. If you weren’t able to attend the live event, you can watch the recorded version here.

If you prefer a white paper format, Niel’s strategies and best practices have also been summarized in a brand-new 8-page white paper, “Reduce Downtime by 70% - Without Spending a Dime” which you can download here.

The Q&A session from the live webinar with Niel Nickolaisen and Michael Bilancieri of Marathon has been summarized below:

Q: Can you give some tips on how I can educate my branch offices about my business continuity plan?
Niel Nickolaisen, CIO: At Headwaters, Inc., we have 120 remote sites. We approached this from an SLA perspective. We translated how the SLAs affected the operations at our branch locations. Then we communicated it, got them to buy into the SLAs and the things we were doing, and suggested that they follow our lead.

Q: How often should you update your disaster recovery plans?
Niel Nickolaisen, CIO: In our case at Headwaters, Inc., we have Sarbanes-Oxley regulatory requirements. We do an annual formal risk assessment both for our business and for IT. When we’re done with that assessment we update our disaster recovery plans, which are based on the risks. Our disaster plan is designed to mitigate or recover from the risks that we’ve identified.

Q: How does everRun work?
At a high level, everRun takes your entire Windows environment and protects it as a whole. Most products protect from within the OS, but we protect from underneath the OS. We clone to a second system for redundancy in a synchronous fashion. A good way to understand how everRun works is to watch the product demo videos and Flash demos available on our website.

Q: How does everRun fit into a virtual environment?
everRun lets you create multiple workloads on a single server. Our technology is based on virtualization technology – we’re virtualizing two instances to appear as one. You can create multiple workloads, put them on the same server, and protect them. It’s based on Citrix XenServer.

Q: Will this work in conjunction with SAN off-host backups using Veritas NetBackup and the FlashSnap option?
We are agnostic to the storage. If you’re backing up right from the SAN, that’s fine. You can also use a mirrored option, where we mirror the entire system in a synchronous fashion. That allows you to have SAN on one side and NAS on the other, or direct-attached, or both. It’s your choice, which gives you greater flexibility. You can also separate the servers between buildings. The other option is a single, unmirrored copy of storage that both systems connect to, but then the SAN device has to protect the data.

Q: How can Marathon contribute to companies considering a move to SAP?
everRun can provide availability and fault-tolerant protection for that SAP environment. If you’re considering a move to SAP, I would assume you have had some discussions about how to protect it: the SLA, the data, availability and disaster recovery. everRun can provide disaster tolerance, disaster recovery, and high availability for that application, as well as data protection. We don’t make any changes to the application.

Q: Should Marathon be brought in as a consultant before SAP is contracted?
Sometimes it’s a good idea to have a joint discussion with vendors. A lot of times when you look at availability and redundancy or data replication, it’s doing things to the applications and data and can cause interaction issues. Sometimes the application has to be configured in a certain way, so you want to know up front how your high availability solution could affect the data and application. We can certainly do a call with any other software vendors to have that conversation up front.

Q: What version of Windows does everRun support?
everRun supports Windows Server 2003 32-bit and 64-bit and Windows Server 2008 64-bit.

Q: What kind of performance impact does the synchronous lock-step have on the system?
That varies by application, users, data, I/O, and other factors. In general, it can range from 10-20% on your application – we’ve seen less than that and more than that, depending on the system.

Q: Do you recommend WAN optimization to be used?
Our requirements are around bandwidth between the two systems if you want to separate the systems. WAN optimization tools don’t always help. It’s really a latency requirement to maintain good performance.


If you'd like to learn more about Niel's best practices for aligning business and IT resources, be sure to check out his new book, Stand Back and Deliver: Accelerating Business Agility.

 

Downtime  Availability  EverRun  High Availability  Webcast  Webinar



Friday, October 2nd, 2009 - 10:17 am EDT

Using a Gap Analysis to Reduce Downtime

Posted by: Michelle Liro

Congratulations to Thomas Burgdorf of Mii Management Group, the winner of a $100 American Express gift card from our recent everRun 2G demo webinar. If you weren’t able to attend the live event, a recording of the everRun 2G demo webinar is now available for on-demand viewing here.

Be sure to join us for our next webinar on Oct. 8th, featuring IT process expert and author Niel Nickolaisen. We're really excited to have Niel as our guest speaker for this webinar. In addition to his 25+ years of IT experience, Niel is the CIO and Director of Strategic Planning at Headwaters, Inc. and also writes regular columns for the CIO Leadership Network and TechTarget's Search CIO. Niel is going to share his proven methods for reducing downtime, including:

* Conducting a gap analysis of your current IT processes
* Identifying weaknesses that can lead to downtime
* Simplifying IT processes so that your entire staff can understand and follow them

We're expecting a large group for this webinar, so be sure to register today to reserve your spot.


 


 

Webinar  Downtime  EverRun  High Availability



Wednesday, September 23rd, 2009 - 10:09 am EDT

Protecting SQL Server from Downtime

Posted by: Brian Mullins

In recent months, Marathon has put together a series of toolkits with materials on reducing downtime and data loss, including toolkits for Citrix XenApp and Microsoft Exchange 2007.

Our latest toolkit is now available, this time for Microsoft SQL Server. Protecting SQL Server from downtime has become even more critical in recent years, as businesses run more of their critical systems, including electronic commerce, online banking, just-in-time manufacturing and streaming media (just to name a few) on SQL.

This toolkit includes materials on SQL Server high availability in both physical and virtual environments.

White paper: 5 Secrets to SQL Server Availability This paper reviews five proven secrets to affordable SQL high availability that will help IT managers implement a SQL Server environment with little or no downtime - and zero data loss.

White paper: The SSWUG.org Increasing Reliability and Availability in a Virtualized SQL Server Environment white paper, authored by Microsoft SQL Server MVP Stephen Wynkoop, provides IT professionals with best practices and considerations for designing and implementing a virtualized SQL environment including:

• Potential pitfalls to avoid when virtualizing SQL Server
• How to increase reliability and availability of a virtualized SQL Server environment
• A SQL Server virtualization case study (Sullivan Group)

On-Demand Webinar: SQL Availability: Protecting your Database and Applications featuring Microsoft SQL Server MVP Stephen Wynkoop, helps IT administrators understand SQL back-up and restore options. Wynkoop also presents his Concentric Rings of Recovery plan, which covers the four levels of preparedness for local, alternate, off-site and remote locations.

Also, be sure to check out some additional SQL Server resources, including SQL user groups, SQL Server job boards, SQL MVP blogs and Twitter feeds, and other SQL-related info.
 

SQL  Availability  Downtime  High Availability



Tuesday, July 21st, 2009 - 5:35 pm EDT

Q&A with David Hanna of Microsoft

Posted by: Brian Mullins

If you’ve been thinking about upgrading to Windows Server 2008, be sure to attend our July 30th webinar featuring guest speaker David Hanna, Information Architect at Microsoft. David will review the new Web tools, virtualization technologies, security enhancements, and management utilities available in Windows Server 2008. You’ll also have a chance to ask David any specific questions you have about Windows Server 2008 during the live Q&A portion of the webcast.

In preparation for the webinar, we asked David to answer a few of the common questions that we have been hearing from our customers in recent months.

Q: One of the biggest concerns we hear from our customers and partners is that in this current economy, IT departments are being asked to do a lot more with fewer people. How can Windows Server 2008 help with this issue?

Across all of my customers, everyone is talking about cutting costs, and getting more out of their current investments. When we start digging into the features of Windows Server 2008, customers are finding tremendous opportunity to optimize their environments. A few of the major areas of cost savings I’m seeing are:

  • Reduced deployment time and costs with Windows Deployment Services
  • Reduced management cost and effort with PowerShell and Server Manager
  • Hardware and Workload Consolidation with Hyper-V
  • Licensing consolidation with Enterprise and Datacenter models for virtual environments.

Q: What about the challenge of managing remote and branch office locations?

Branch offices have consistently been a challenge to manage, primarily due to a lack of on-site staff. Windows Server 2008 brings some major new components to the picture that will greatly ease branch office management. These features include the Read-Only Domain Controller, which makes the remote DC secure and replaceable; Distributed File System; Windows Remote Management; Server Core (smaller attack surface); and improved Terminal Services for application delivery.

Q: A lot of our customers work in “always-on” industries like manufacturing, healthcare and broadcast media, where server downtime can be very disruptive to their business. How does Windows Server 2008 support these demanding environments?

Windows Server has always addressed high availability with Clustering Services. Windows Server 2008 has brought some huge enhancements to the Cluster Service that will reduce the complexity of clustering, while increasing availability. Failover Clustering in Server 2008 has a new validation wizard that will validate hardware and software configurations, resulting in easier, more reliable cluster deployments. The reliance on a quorum drive has also been removed, so there is no longer a single point of failure in the cluster. Also, Failover Clustering has been enhanced to support multi-site clusters to support organizations that need site-to-site failover. And, as always, when organizations need to take availability to the next level, Microsoft continues to work with partners like Marathon to extend the native capabilities of Windows Server.

***********************************************************************************************

During the webinar, Michael Bilancieri, Sr. Director of Products for Marathon, will discuss how to extend the high availability features of Windows Server 2008 to fault tolerant protection with Marathon’s everRun software and how organizations can now confidently migrate mission critical applications from Unix or proprietary platforms to realize big cost savings.

Registrations for this webinar are limited and we are expecting a large turnout, so be sure to save your spot by registering today.


 

Webinar  Availability  Clustering  Clusters  Downtime  EverRun  Fault Tolerance  Fault Tolerant  High Availability  Webcast



Wednesday, April 1st, 2009 - 7:51 am EDT

everRun and Exchange 2007 Mailbox Servers

Posted by: Tom Reed

When planning your VM workloads, you should be aware of what level of availability each server will need. By splitting your users across multiple VMs, you can provide a level of availability to each set of users based upon your SLA with each business unit in your company. Looking back to the availability pyramid, you can choose which level of availability each mailbox server needs. For example, if you have an executive group that needs 24/7 uptime with only limited downtime, then Level 3 on a separate mailbox server should be your selection. If all of your business units require the same level of availability and have the same SLA in place, then you will split your mailbox servers according to usage. Using the chart from section one, we can split the users based upon the type of user. For example, if you have 1,000 heavy users, we would assign 2 vCPUs to the virtual machine. Always follow Microsoft best practices for the number of users per core or vCPU.

If we look at Figure 1, we can see that we have 4 active VMs spread across two servers, with 2 vCPUs assigned to each. Looking at the example chart above and using Figure 1, we can see that this design example would support 4,000 “heavy” users. We achieve this by allowing the storage groups on each mailbox VM to support 1,000 “heavy” users.
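
The capacity arithmetic in this example can be sketched in a few lines of Python. This is only an illustration, not a sizing tool: the function names are hypothetical, and the ratio of 1,000 heavy users per 2-vCPU mailbox VM is taken from the example in this post as an assumption. Real deployments should follow Microsoft's per-core guidance.

```python
import math

# Assumed from the example in this post: one mailbox VM with 2 vCPUs
# supports about 1,000 "heavy" users. Adjust to match Microsoft guidance.
HEAVY_USERS_PER_VM = 1000
VCPUS_PER_VM = 2


def supported_heavy_users(vm_count: int) -> int:
    """Return how many heavy users a given number of mailbox VMs supports."""
    return vm_count * HEAVY_USERS_PER_VM


def mailbox_vms_needed(heavy_users: int) -> int:
    """Return how many 2-vCPU mailbox VMs are needed, rounding up."""
    if heavy_users < 0:
        raise ValueError("heavy_users must be non-negative")
    return math.ceil(heavy_users / HEAVY_USERS_PER_VM)


if __name__ == "__main__":
    vms = 4  # four active VMs spread across two servers
    print(supported_heavy_users(vms))  # 4000
    print(vms * VCPUS_PER_VM)          # 8 vCPUs in total
    print(mailbox_vms_needed(2500))    # 3
```

Rounding up in `mailbox_vms_needed` keeps each VM at or below the assumed 1,000-user limit rather than overloading the last one.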

Distributed workload across 2 servers

Let’s take a look at a basic design with 3 separate types of users spread across 4 servers. We have an executive mailbox store, a mid-management store, and a general user store. In looking over each team’s HA requirement, we have come to the following: the executive team needs 24/7 uptime with no downtime except for a maintenance window once a month. The mid-management team can handle some downtime, but only a few minutes each week. The general users have no HA requirement; they can be down for an hour a week if needed. So how do we decide what level of availability to use? It’s easy: we simply look at the application availability pyramid and put the appropriate mailbox store at each level:

By using this simple plan, you can simplify your HA strategy for Exchange. By distributing the mailbox stores across multiple servers on the same hardware, you can save rack space as well as provide individual levels of availability based upon different business unit needs.
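
To compare the downtime budgets above on a common scale, it can help to convert them into availability percentages. The sketch below is a minimal illustration with a hypothetical function name. The one-hour-a-week figure for general users comes from the post, while the 5 minutes a week used for mid-management is an assumed stand-in for "a few minutes".

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes


def availability_percent(downtime_minutes_per_week: float) -> float:
    """Convert a weekly downtime budget into an availability percentage."""
    if not 0 <= downtime_minutes_per_week <= MINUTES_PER_WEEK:
        raise ValueError("downtime must be between 0 and one full week")
    return 100.0 * (1 - downtime_minutes_per_week / MINUTES_PER_WEEK)


if __name__ == "__main__":
    # General users: up to an hour of downtime a week (from the post).
    print(f"{availability_percent(60):.2f}%")  # 99.40%
    # Mid-management: "a few minutes" a week; 5 minutes is an assumption.
    print(f"{availability_percent(5):.2f}%")   # 99.95%
```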

Availability  Downtime  EverRun  Exchange  Exchange 2007  Marathon  Virtual Machine



Wednesday, December 10th, 2008 - 7:11 am EST

Webinar: Assessing the Impact of Planned and Unplanned Downtime in the Contact Center

Posted by: Brian Mullins

Business continuity planning ranks among the top trends in a recent Dimension Data report on contact center technology. Yet many call centers aren’t equipped to deal with unexpected downtime from a system failure. These centers would lose productivity and sacrifice service levels when mission-critical tools like real-time reporting systems go dark.

Real-time reporting provider Inova Solutions, along with new partner Marathon Technologies, will host a webinar to discuss best practices for business continuity and high availability in the contact center. Presenter Scott Thompson from Marathon Technologies will discuss how to protect your real-time reporting investment from costly downtime and data loss.

Participants can register for the webinar here. Details are below:

What: Webinar: “Assessing the Impact of Planned and Unplanned Downtime in the Contact Center”
When: Wednesday, December 10, 2008, 2:00 pm EST

via Inova Solutions website.

Availability  Business Continuity  Downtime  High Availability  Marathon  Partners  Webinar



Monday, November 24th, 2008 - 3:11 pm EST

UNDERSTANDING DIALABLE AVAILABILITY

Posted by: Brian Mullins


As many of you know, one of the key components of everRun VM is the ability to dial up or dial down the level of availability needed to protect business-critical applications. With buzz surrounding the release of Citrix’ XenServer 5, we have been approached with questions like “what should I use to protect my low-priority applications” and “how do I know when something should or shouldn’t be protected with the lockstep option?” To help explain the three levels of availability and when they would be used, we’ve put together these tips:

LEVEL 1: BASIC FAILOVER WITH XENSERVER HA
The first level of availability, basic failover and recovery, is appropriate for applications where recovery is not absolutely critical, and where manual intervention, while not desirable, is acceptable. These may include infrastructure applications or dev and test systems.
XenServer HA provides:

  • Basic failover to another host within the same Xen pool, with resource calculation to determine whether adequate resources are available within the pool to handle a defined number of simultaneous host failures (XenServer HA does not check the health of available devices, such as network and storage)
  • Monitoring of health of the hosts within a pool (Network and storage health are not monitored)
  • No storage or data protection – using this level requires a shared-storage configuration

LEVEL 2: COMPONENT-LEVEL FAULT TOLERANCE WITH everRun VM
For applications with business-critical roles, everRun VM provides component-level fault tolerance: the ability to withstand the loss of an individual network or storage component without interruption or downtime.
The attributes of Level-2 availability include:

  • Automated setup and fault management: policies handle system, network and disk I/O failures without IT intervention
  • Assured recovery of virtual machines
  • Zero downtime due to I/O failures and zero data loss
  • Synchronous data mirroring between hosts; no need for shared storage
  • Continuous active validation of all components on production and standby system to ensure complete redundancy at all times for recovery in the event of a failure
  • Comprehensive availability including system, network, and data availability, all in one integrated solution

LEVEL 3: SYSTEM-LEVEL FAULT TOLERANCE WITH everRun VM AND LOCKSTEP OPTION
For the most mission-critical systems, Marathon everRun VM with Lockstep Option provides system-level fault tolerance, with continuous availability in the face of component or system-wide failures. Level 3 will be available in 2009 and offers protection for systems that cannot experience any downtime and must maintain transaction state at all costs. everRun VM with Lockstep Option offers all of the benefits of everRun VM (Level 2), together with:

  • Zero downtime even for complete host failures
  • Application state maintained during failures
  • Memory state maintained during failures

For more information on the different levels of availability please visit here.
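
The selection criteria for the three levels can be summarized as a small decision function. This is a hedged sketch of the guidance in this post, not a Marathon or Citrix API: the enum, the function, and its two yes/no inputs are hypothetical names introduced for illustration.

```python
from enum import IntEnum


class AvailabilityLevel(IntEnum):
    """The three levels described in this post."""
    BASIC_FAILOVER = 1  # XenServer HA
    COMPONENT_FT = 2    # everRun VM
    SYSTEM_FT = 3       # everRun VM with Lockstep Option


def choose_level(manual_recovery_ok: bool, must_keep_state: bool) -> AvailabilityLevel:
    """Pick a level from two questions about the application.

    manual_recovery_ok: recovery is not absolutely critical and manual
        intervention is acceptable (for example, dev and test systems).
    must_keep_state: the system cannot experience any downtime and must
        maintain transaction state.
    """
    if must_keep_state:
        return AvailabilityLevel.SYSTEM_FT
    if manual_recovery_ok:
        return AvailabilityLevel.BASIC_FAILOVER
    return AvailabilityLevel.COMPONENT_FT


if __name__ == "__main__":
    print(choose_level(manual_recovery_ok=True, must_keep_state=False).name)
    print(choose_level(manual_recovery_ok=False, must_keep_state=False).name)
    print(choose_level(manual_recovery_ok=False, must_keep_state=True).name)
```

The state requirement is checked first so that an application that must keep transaction state always lands on Level 3, whatever the answer to the other question.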

Availability  XenServer  Citrix  Continuous Availability  Downtime  EverRun  EverRun VM  Fault Management  Fault Tolerance  Marathon  Virtual Machine  XenServer HA



Thursday, November 13th, 2008 - 8:46 am EST

How A Large Furniture Retailer Benefitted From Protecting MS Exchange

Posted by: Brian Mullins

Every day, companies around the world rely on the features of Microsoft Exchange for their business-critical applications like email, calendaring, contacts, mobile support, web-based information accessing and data storage support. While we’ve discussed the importance of maintaining Microsoft Exchange high availability and steps to simpler Exchange HA, we thought this would be a good opportunity to share a case study from one of our customers.

Connecting Employees, Vendors and Customers Without Interruption

A large U.S. furniture retailer was becoming increasingly dependent on Microsoft Exchange 2003 for internal communication and collaboration, and for communication with both vendors and customers. Since the retailer’s primary revenue-generating activities relied on e-mail, downtime would have resulted in serious consequences. As a result, Exchange protection became a requirement and top priority for senior leaders.

everRun – An Alternative to Clustering

The IT staff had previous experience with traditional clustering and was looking for an easier, more robust solution. They selected everRun and implemented a solution using a pair of IBM servers with local boot disks and a fibre channel SAN for the datastore. Currently, the system supports close to 1200 users.

No Exchange Failures = Increased Competitive Advantage

In over a year of operation, they have not experienced a single unplanned disruption of their Exchange system. In turn, this has allowed them to keep their revenue-generating activities operating at full speed. As a result, they are currently looking at adding the everRun SplitSite option to allow geographical separation of their systems for additional protection.

Do you have a story when protecting Exchange would have been a better option than what resulted? How did it affect you or your company?

Case Study  Clustering  Downtime  EverRun  Exchange  High Availability  Marathon  SplitSite  Virtualization



Wednesday, October 29th, 2008 - 7:21 am EDT

FIVE STEPS TO SIMPLER EXCHANGE HIGH AVAILABILITY

Posted by: Michael Bilancieri

As we noted in our last post, Exchange High Availability has become increasingly important to businesses of all sizes. To help you get started, we’ve put together these five tips, which are easily-digestible pieces from our “Protecting Microsoft Exchange in Physical and Virtual Environments” white paper.

STEP ONE – PROTECT AGAINST SERVER FAILURES WITH QUALITY HARDWARE AND COMPONENT REDUNDANCY

Server core components include power supplies, fans, memory, CPUs and main logic boards. Purchasing robust, name brand servers, performing recommended preventative maintenance, and monitoring server errors for signs of future problems can all help reduce the chances of Exchange downtime due to catastrophic server failure.

Downtime caused by server component failures can be significantly reduced by adding redundancy at the component level. Examples include redundant power and cooling, ECC memory with the ability to correct single-bit memory errors, and combining Ethernet cards with RAID.

STEP TWO – GET RID OF STORAGE FAILURES WITH STORAGE DEVICE REDUNDANCY AND RAID

Storage protection relies on device redundancy combined with RAID storage algorithms to protect data access and data integrity from hardware failures. There are distinct issues for both local disk storage and for shared, network storage.

For local storage, it is quite easy to add extra disks configured with RAID protection. A second disk controller is also required if you want to protect against controller failures.

Access to shared storage relies on either a fibre channel or Ethernet storage network. To assure uninterrupted access to shared storage, these networks must be designed to eliminate all single points of failure. This requires redundancy of network paths, network switches, and network connections to each storage array.

STEP THREE – PREVENT NETWORK FAILURES WITH REDUNDANT NETWORK PATHS, SWITCHES AND ROUTERS

The network infrastructure itself must be fault-tolerant, consisting of redundant network paths, switches, routers and other network elements. Server connections can also be duplicated to eliminate failovers caused by the failure of a single server or network component. Take care to ensure that the physical network hardware does not share common components. For example, dual-ported network cards share common hardware logic, and a single card failure can disable both ports. Full redundancy requires either two separate adapters or the combination of a built-in network port along with a separate network adapter.

STEP FOUR – FORGET SITE FAILURES WITH DATA REPLICATION TO ANOTHER SITE

Site failures can range from an air conditioning failure or a leaking roof that affects a single building, a power failure that affects a limited local area, or a major hurricane that affects a large geographic area. Site disruptions can last anywhere from a few hours to days or even weeks.

There are two methods for dealing with Site Disasters. One method is to tightly couple redundant servers across high speed/low latency links, to provide zero data-loss and zero downtime. The other method is to loosely couple redundant servers over medium speed/higher latency/greater distance lines, to provide a disaster recovery (DR) capability where a remote server can be restarted with a copy of the application database, which only misses the last few updates. In the latter case, asynchronous data replication is used to keep a backup copy of the data.

Data replication is combined with error detection and failover tools to help get a disaster recovery site up and running in minutes or hours, rather than days.

STEP FIVE – CONSIDER VIRTUALIZING EXCHANGE FOR BETTER HIGH AVAILABILITY

The latest server virtualization technologies, while not required for protecting Exchange, do offer some unique benefits that can make Exchange protection both easier and more effective. Virtualization makes it very easy to set up evaluation, test and development environments without the need for additional, dedicated hardware. Virtualization also allows resources to be adjusted dynamically to accommodate growth or peak loads.

To help you make the business case for virtualizing Exchange, we’re producing a live webinar with Citrix on November 11th: Virtualizing Exchange - The Cold, Hard Numbers on Why Citrix XenServer + everRun VM is the Best Platform. Register for the webinar here.

Downtime  EverRun VM  Exchange  High Availability  Virtualization  Webinar



Monday, October 27th, 2008 - 6:46 am EDT

The Importance of Maintaining Microsoft Exchange High Availability

Posted by: Brian Mullins

For most organizations, email is the single most important tool for accomplishing business objectives. Without access to email, companies are at an immediate disadvantage in today’s “I want it now” marketplace. For example, let’s look at the impact email downtime has on productivity: assuming that your employees are 25% less productive when email is unavailable, and their annual salary is $60,000, then every hour of downtime for an organization of 500 people results in more than $7,200 in lost employee productivity. Can your organization bear a $7,200/hour loss? In today’s economy? Definitely not.
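
The productivity estimate above follows a simple formula: employees × hourly wage × fraction of productivity lost. The sketch below is a minimal illustration with a hypothetical function name. The post does not state how many working hours per year it assumes, so the 2,080 hours used here (40 hours × 52 weeks) is an assumption. Under that assumption the 25% case comes to roughly $3,600 per hour, and the $7,200 figure quoted in the post corresponds to a 50% productivity drop, so plug in your own numbers before relying on either.

```python
WORKING_HOURS_PER_YEAR = 2080  # assumption: 40 hours x 52 weeks


def hourly_downtime_cost(
    employees: int,
    annual_salary: float,
    productivity_loss: float,
    hours_per_year: float = WORKING_HOURS_PER_YEAR,
) -> float:
    """Estimate lost employee productivity per hour of email downtime.

    productivity_loss is a fraction between 0 and 1 (0.25 means 25%).
    """
    if not 0 <= productivity_loss <= 1:
        raise ValueError("productivity_loss must be between 0 and 1")
    hourly_wage = annual_salary / hours_per_year
    return employees * hourly_wage * productivity_loss


if __name__ == "__main__":
    print(round(hourly_downtime_cost(500, 60_000, 0.25), 2))  # 3605.77
    print(round(hourly_downtime_cost(500, 60_000, 0.50), 2))  # 7211.54
```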

Avoiding that consequence is an option, but in order to do so you need to guarantee continuous availability for your organization’s email server. According to Paul Rubens at ServerWatch, 2007 forecasts from Gartner projected that Microsoft Exchange 2007 would own 70% of the email market by 2010. Whether Microsoft will actually achieve those results is still too early to tell. However, as more and more companies rely on Exchange servers to run business functions, all potential causes of unplanned downtime need to be identified and eliminated.

Over the next month, we will be providing you with some recommendations on how to improve Exchange high availability through planned and unplanned downtime – starting with a webinar on November 11 titled “Virtualizing Exchange – The Cold, Hard Numbers on Why Citrix XenServer and everRun VM is the Best Platform.” For this webinar, Jerry Melnick, Marathon CTO, and Matt Fairbanks, VP of Product Marketing for Citrix Virtualization and Management Division, will team up to discuss how the latest server virtualization technologies keep users continuously connected to Microsoft Exchange servers in the easiest and most effective manner. We encourage you to register online for the webinar if you haven’t already.

Is there anything in particular related to protecting your Exchange servers you would like us to address in the next few weeks? Leave us a comment below and we will be sure to put it on our radar.

Availability  Citrix  Continuous Availability  Downtime  EverRun  EverRun VM  Exchange  High Availability  Marathon  Virtualization  Webinar  XenServer



Friday, October 24th, 2008 - 11:38 am EDT

Asking the right questions to ensure the right solution

Posted by: Gary Phillips

As a result of economic turbulence, companies of all sizes continue to explore virtualization as an option for shedding costs. With the growing number of virtualization options available, it’s important not to let your organization fall victim to virtualization buzzwords. Not all vendors offer the benefits of virtualization, yet many claim they do.

With that being said, when deciding which solution to implement within your organization, IT decision makers should be prepared with an arsenal of questions to ask each provider – doing so will eliminate the typical “fluff” vendors use to sell their supposed virtualization solutions. Having all your questions answered will ensure that you get the most appropriate and highest quality solution for the applications you wish to protect. The following is a list of questions that might assist IT professionals in making their virtualization-related decisions, and some other considerations we offer:

  • Should I start to deploy on a small scale or implement everything at once? Answers will vary depending on the size and flexibility of your organization. It’s important that the vendor understand the nature of your business and the value of your critical data before making a suggestion. Whether you are a small, nimble organization with the ability to deploy on all critical apps, or a large enterprise with procedural requirements that prevent you from total deployment, the implementation strategy should be tailored to your needs. There is not a “one size fits all” virtualization strategy.

  • How much should I consolidate? We usually suggest phased deployment – start from scratch with the applications and environments that aren’t so mission critical, and then continue deploying as you see appropriate. It’s important to make sure that the vendor you have chosen can support your initiative.

  • If I do decide to consolidate, does the server virtualization option I have chosen also meet my application availability needs? Since the implications of downtime in virtual environments have become greater, understanding solutions used to protect business critical applications is crucial. Some important things to consider are:
    • Is the solution a “one-size-fits-all” approach, or does it offer flexible protection?
    • Does it support different levels of availability for your applications?
    • What will my cost savings be?

  • How am I going to manage the virtualization solution? The manageability of virtual machines is a different dynamic, especially if this is your first time dealing with virtual machines. The IT processes and management needs are very different. A plan for management must be in place in order to have a successful solution – otherwise you’ll find a lot of redundancy and the need for unnecessary maintenance.

  • What do I need for security? As higher-priority applications are moved to the virtualization environment, security disciplines need to move as well.

These are just a few examples that should help get the conversation going. Has anyone deployed a virtual environment that wasn’t the right fit? What were the repercussions and what needed to be done to correct it?

If anyone has any questions they wish they had asked prior to purchasing, please leave them in the comments below and we will be sure to add them to the list.

Availability  Downtime  Marathon  Virtual Machine  Virtualization  Virtualization.info



Tuesday, August 26th, 2008 - 11:54 am EDT

Vehicle Manufacturing Executives Talk About everRun

Posted by: Brian Mullins

In the vehicle manufacturing industry, companies want an efficient and economical way to ensure smooth operation of all servers, software and applications. Any instance of unscheduled downtime could lead to a loss of data, or in a worst case scenario, to a complete disruption of production and services.

Serve customer needs online without interruption

One European vehicle manufacturer, who understands the importance of protection against downtime, has been using Marathon solutions since 2000. As their security needs as an organization have grown, so has their relationship with Marathon. They began by using the Endurance 4000 system to help protect their forklift management system. Three years later, they upgraded to everRun FT to further safeguard files and applications and to ensure continuous server availability.

The implementation of everRun FT gave the company the opportunity to undertake other IT projects to maximize efficiency and reliability. They were able to establish a centralized network to allow the entire staff to access all applications and system updates remotely.

Defend 24/7 operations with Marathon everRun FT software

With these new initiatives in place, one company executive says that it is now more important than ever for applications and servers to be accessible 24/7 – no matter what. “A disruption to the provision of data and applications would affect every employee, and in the worst case scenario, halt operations altogether,” said the executive.

The company uses both Marathon’s everRun FT and SplitSite to allow two servers to operate simultaneously in 100 percent lockstep. The two servers form a single virtual environment, so if one fails, no downtime occurs and all software, applications and data continue to run on the remaining server. SplitSite provides an additional layer of protection against larger-scale failures and disasters.
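To make the lockstep idea concrete, here is a toy sketch in Python. It is not how everRun FT or SplitSite is implemented; the `Replica` and `LockstepPair` classes and the server names are hypothetical, invented only for illustration. The sketch assumes every input is applied to both servers, so each holds the same state and the survivor can keep answering after its partner fails.

```python
class Replica:
    """Hypothetical stand-in for one server in a lockstep pair."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.total = 0  # application state: a simple running total

    def apply(self, value: int) -> int:
        """Apply one input to this replica's state and return the new state."""
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        self.total += value
        return self.total


class LockstepPair:
    """Feeds every input to both replicas so their state stays identical."""

    def __init__(self) -> None:
        self.replicas = [Replica("server-a"), Replica("server-b")]

    def submit(self, value: int) -> int:
        # Only replicas that are still alive process the input.
        results = [r.apply(value) for r in self.replicas if r.alive]
        if not results:
            raise RuntimeError("both replicas are down")
        # Surviving replicas hold identical state, so any result will do.
        return results[0]


pair = LockstepPair()
pair.submit(5)                  # both replicas now hold a total of 5
pair.replicas[0].alive = False  # simulate the failure of one server
print(pair.submit(7))           # the survivor answers: prints 12
```

Because the second server already holds the full state, nothing has to be restored after the failure. That is the difference between tolerating a fault and recovering from one.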

The organization utilizes several levels of security, including a single server, a Windows cluster, and a Marathon System, but all of their most important and mission-critical applications are operated on everRun FT.

Elimination of system failure and increased competitive advantage

Any instance of unscheduled downtime would affect not only the company’s main factory, but also its several hundred other outlets. If an employee were unable to connect to the network because the server was down, all data on customers and products would become unavailable, which could mean stalled productivity and unhappy customers. The company executive maintains that this is no longer a concern, thanks to Marathon. “With everRun FT, we no longer have to worry about downtime.”

Availability  Case Study  Downtime  EverRun  Manufacturing  Marathon 




Wednesday, July 30th, 2008 - 11:56 am EDT

Preventing Disaster Rather than Recovering from It

Posted by: Michael Bilancieri

We all like to think that we will be prepared in the event of an emergency, or a disaster. Hospitals exist if we fall sick; fire stations surround us if flames break loose; we are constantly preparing so if a catastrophe strikes, we are ready.

Preparing for a system disaster is no different. However, knowing how to prepare for an event like this can be confusing. There are many options for protecting your system, each best suited to specific requirements. Unfortunately, many vendors use terms like disaster recovery and high availability interchangeably to describe their solutions when in fact they are usually designed for one or the other.

Disaster Recovery (DR) is the way to recover applications and data from a system failure. DR is a reactive solution: if a failure occurs, IT relocates the data, rebuilds the system, and brings everything back up to working order. This takes time, a precious commodity that businesses relying on critical applications typically don’t have. In addition, recovering applications can bring a number of side effects that you really don’t want to endure every time some minor failure happens.

But what if I could tell you that instead of worrying about how to recover from a computer system failing, you could simply prevent it from occurring at all?

Disaster tolerance (DT) is a proactive way to prevent system failure from impacting application and data availability. A disaster tolerant solution isn’t going to recover the data if there’s a disaster. Instead it will tolerate the fault if a disaster occurs – keeping an organization’s critical applications up and running at all times. It is not recovery, but rather prevention. And with solutions like our everRun SplitSite, separate servers don’t even need to be in the same building – they can be up to 100 miles apart with fault-tolerant protection between the two locations.

DR solutions are good for applications that can afford some downtime while you recover them. But for essential applications like Microsoft Exchange, SQL Server, and SharePoint, which need to be available all the time, disaster tolerance is often the best way to go.

So what combination of DT and DR protection would work best for your company’s applications?

Availability  CIO  Disaster Recovery  Disaster Tolerance  Downtime  EverRun  Exchange  Fault Tolerance  High Availability  Marathon  Sharepoint 




Tuesday, June 17th, 2008 - 6:41 am EDT

Current HA Solutions Fail to Deliver What Customers Want

Posted by: admin

A research report by IDC’s virtualization guru, John Humphreys, “The Future of Virtualization: Leveraging Mobility to Move Beyond Consolidation,” highlights the fact that the automatic restart used by most high availability solutions for virtualization fails to deliver what most customers really want and need. Here is what John has to say:

“To address unplanned downtime today virtualization companies are providing an automatic restart capability if the hypervisor or host go down for whatever reason. While this is a good start to trying to combat the lost revenue associated with unplanned outages, ultimately knowing what is happening at the hypervisor and hardware layers fails to deliver customers what they most want — application-level awareness and action. In this way, current HA solutions in the virtualization market are "blind from the waist up." That is, they do not know what is happening inside the virtual machine. They do not know if the operating system or application has stopped working, and that is ultimately what IT professionals charged with delivering application services most care to know.”
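The gap John describes can be illustrated with a small Python sketch. It is a simplified model, not any vendor’s API: the `VirtualMachine` class and the check functions are hypothetical names invented for this example. It contrasts a host-level check, which sees only the hypervisor and hardware layers, with an application-level check that also looks inside the virtual machine.

```python
from dataclasses import dataclass


@dataclass
class VirtualMachine:
    """Hypothetical model of a VM as seen from two vantage points."""
    host_up: bool = True         # visible at the hypervisor/hardware layer
    os_responding: bool = True   # visible only from inside the guest
    app_responding: bool = True  # visible only from inside the guest


def host_level_check(vm: VirtualMachine) -> bool:
    """What a restart-only HA layer can see: is the host alive?"""
    return vm.host_up


def application_level_check(vm: VirtualMachine) -> bool:
    """An application-aware check also looks inside the guest."""
    return vm.host_up and vm.os_responding and vm.app_responding


def needs_action(vm: VirtualMachine, application_aware: bool) -> bool:
    """Return True when the monitoring layer would notice a problem."""
    check = application_level_check if application_aware else host_level_check
    return not check(vm)


# A hung application on an otherwise healthy host.
hung = VirtualMachine(host_up=True, os_responding=True, app_responding=False)
print(needs_action(hung, application_aware=False))  # prints False
print(needs_action(hung, application_aware=True))   # prints True
```

In this model, the host-level view reports nothing wrong even though the application has stopped working, which is the “blind from the waist up” problem in miniature.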

If you would like to learn more about high availability for virtualization, how to get application-level awareness, and what that can buy you, we encourage you to join the webinar on Thursday, June 26, at 11:30 EST with John Humphreys (IDC), Simon Crosby (Citrix) and Jerry Melnick (Marathon).

For more information or to register visit here.

Availability  Citrix  Downtime  High Availability  Hypervisor  IDC  Marathon  Simon Crosby  Virtual Machine  Virtualization 



