Ticketing/Entry System Crash

Unplugged

Well-Known Member
So is it naive to believe there should be a backup system, or is it naive to believe they should have procedures for what to do if a system goes down (or both)?
You are correct. Technically, their system should be running in a private cloud so if a computer (real or virtual) fails, another is in place to take over. This solution also handles load balancing during busy times. Now if someone "updated" production software with bad code (that is, software without proper testing, or data that was incorrect), that could have taken down a cloud system as well, since the update would be pushed to the entire cloud. But proper management of the services would have allowed quick restoration.

Of course, I'm sure they have a very robust solution in place and I don't know what I'm talking about.
 

WDWTrojan

Well-Known Member
Isn't that a magical moment? When Disney takes a negative and makes it a memory? How about free umbrellas? A souvenir and shelter from the Florida sun. Shouldn't there be misters at the gates? They do help. I've read they're all over in Tokyo Disney. I go a lot, never seen them at gates, a couple in AK.

Guest Service Recovery is what you're talking about. Taking a negative experience and exceeding expectations to make it positive.

Magical Moments, in corporate parlance, are giving guests a special "unforgettable" memory for no reason. This could be anything from being the Grand Marshal in the parade, to re-riding an attraction without getting off, special photo ops/character interactions, complimentary Mickey bars if you look like you're having a bad day, etc.
 

larryz

I'm Just A Tourist!
Premium Member
Guest Service Recovery is what you're talking about. Taking a negative experience and exceeding expectations to make it positive.

Magical Moments, in corporate parlance, are giving guests a special "unforgettable" memory for no reason. This could be anything from being the Grand Marshal in the parade, to re-riding an attraction without getting off, special photo ops/character interactions, complimentary Mickey bars if you look like you're having a bad day, etc.
Well, this was certainly an "unforgettable" memory -- waiting for a couple of hours due to a system crash...
 

danlb_2000

Premium Member
You are correct. Technically, their system should be running in a private cloud so if a computer (real or virtual) fails, another is in place to take over. This solution also handles load balancing during busy times. Now if someone "updated" production software with bad code (that is, software without proper testing, or data that was incorrect), that could have taken down a cloud system as well, since the update would be pushed to the entire cloud. But proper management of the services would have allowed quick restoration.

Of course, I'm sure they have a very robust solution in place and I don't know what I'm talking about.

Having redundancy is becoming more common, so more failures tend to fall into the second category you describe. Google had a massive failure of its cloud service recently. The root cause was an incorrect configuration pushed out to servers, but there were also a number of other failures of things that should have mitigated the impact of the incorrect configuration and helped with recovery from the failure. No matter how good your plans are, outages are never completely avoidable, so there should be procedures in place to allow business to continue even if it's in a somewhat degraded fashion.
 

Jedi Stitch

Well-Known Member
Disneyland and WDW work tickets entirely differently. At WDW, there are touch-points and finger scans. The only time they'll take a photo of a guest is if they can't or refuse to use the finger scan. Then a CM with an iPad comes over, scans your admission media and takes a photo. After that, the guest has to request a CM with an iPad to enter every time they want to go to a park. DL uses barcode and photo, no finger scan.

Premiere Passes have both RFID and barcode because they have to be compatible with both systems. A photo taken for entry at DL won't work for no-finger entry at WDW, WDW will take their own.
Thanks Monty, that is good to know. Now, I am confused how the system only let MB in and not hard cards. Huh.
 

Obobru

Well-Known Member
Original Poster
Thanks Monty, that is good to know. Now, I am confused how the system only let MB in and not hard cards. Huh.

MB are managed on a separate system. When you buy a ticket, the admission data (number of days, admission rules such as park hopping, etc.) is transferred to the MB database to match up with other data tied to the MB. I believe this is a cloud-based solution and doesn't need a live check against the database, as the data can be stored locally and sent later. The other entry media run from the ticketing software servers (not cloud-based) and need to be checked against them in real time to confirm they're valid.
 

aaronml

Well-Known Member
MB are managed on a separate system. When you buy a ticket, the admission data (number of days, admission rules such as park hopping, etc.) is transferred to the MB database to match up with other data tied to the MB. I believe this is a cloud-based solution and doesn't need a live check against the database, as the data can be stored locally and sent later. The other entry media run from the ticketing software servers (not cloud-based) and need to be checked against them in real time to confirm they're valid.
Ticketing data isn’t stored in the MagicBand management system database. It's actually some amount of MagicBand/NGE data (mostly a reference ID) that’s stored in ATS.

MagicBand or not, tickets can’t be validated if ATS is down.

When you scan a MagicBand at a tapstile, an API call is made to the MagicBand system to receive the NGE reference ID for your MDX account. ATS is then checked for tickets associated with your account and if you have a valid one, you are granted admission to the park. FastPass+ and most other NGE systems work in a similar way.

The reason why MagicBands were “working” during this incident is because the tapstiles were in “Auto-Green” mode.
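
For anyone who wants the flow above spelled out, here is a rough Python sketch. It is only an illustration: the class names, the `tap` function, and the exact behavior of "Auto-Green" (admit when the ticket check can't be completed) are my assumptions for the example, not the real XBMS or ATS interfaces.

```python
from dataclasses import dataclass, field


class TicketingUnavailable(Exception):
    """Raised when the (hypothetical) ticketing backend cannot be reached."""


@dataclass
class BandSystem:
    # Hypothetical stand-in for the band system: RFID -> account reference ID.
    refs: dict = field(default_factory=dict)

    def reference_id(self, rfid):
        return self.refs.get(rfid)


@dataclass
class TicketSystem:
    # Hypothetical stand-in for ATS: reference ID -> number of valid tickets.
    tickets: dict = field(default_factory=dict)
    online: bool = True

    def has_valid_ticket(self, ref_id):
        if not self.online:
            raise TicketingUnavailable("ticketing backend is down")
        return self.tickets.get(ref_id, 0) > 0


def tap(rfid, bands, ticketing, auto_green=False):
    """Return 'admit' to let the guest in, or 'refer' to send them to a CM."""
    # Step 1: look up the account reference ID for the scanned band or card.
    ref_id = bands.reference_id(rfid)
    if ref_id is None:
        return "refer"
    try:
        # Step 2: check the ticketing system for a valid ticket on that account.
        return "admit" if ticketing.has_valid_ticket(ref_id) else "refer"
    except TicketingUnavailable:
        # Assumed Auto-Green behavior: admit anyway when ticketing is down.
        return "admit" if auto_green else "refer"
```

With the ticketing backend marked offline, `tap(...)` refers every guest to a CM unless `auto_green=True` is passed, which is the degraded mode described above.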
 

aaronml

Well-Known Member
Maybe cards and MBs are in two separate databases. ^^^ what he said...
MBs and RFID Ticket Cards are not in two separate databases. They are both in the same system/database, known as XBMS (XBand Management System).
They literally work the same way from a ticketing/admission standpoint.

If you had an RFID ticket card or KTTW card (yes you can still get these), it would have worked during this incident.
 

BoarderPhreak

Well-Known Member
Having redundancy is becoming more common, so more failures tend to fall into the second category you describe. Google had a massive failure of its cloud service recently. The root cause was an incorrect configuration pushed out to servers, but there were also a number of other failures of things that should have mitigated the impact of the incorrect configuration and helped with recovery from the failure. No matter how good your plans are, outages are never completely avoidable, so there should be procedures in place to allow business to continue even if it's in a somewhat degraded fashion.
Redundancy is not a new concept; it's been employed for decades. Even the methods aren't all that new; fault tolerance, load-balancing, redundancy, failover. All the things we employ to make sure our stuff stays running (even if degraded, as you point out). The key is to test these scenarios regularly. Issues can always creep in; flaky software, hardware failures and yes - plain old human error. And as infallible as we think things are, there should be manual methods in place wherever possible, as much as possible anyway, in case things really get ugly. This is where a solid runbook helps tremendously.

You can't avoid every eventuality, but with the right infrastructure and procedures you can reduce the impact to near zero. Planning, testing and documentation are key.
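
As a toy illustration of the failover-plus-manual-fallback idea, here is a minimal Python sketch. The function and parameter names are made up for the example, and it assumes a backend signals an outage by raising `ConnectionError`.

```python
def validate_with_failover(ticket_id, backends, manual_fallback):
    """Try each backend in order; use the manual procedure if all are down."""
    for backend in backends:
        try:
            # First backend that answers wins, whatever its answer is.
            return backend(ticket_id)
        except ConnectionError:
            # Assumed outage signal: move on to the next redundant backend.
            continue
    # Every automated path failed, so fall back to the runbook's manual step.
    return manual_fallback(ticket_id)
```

The point of the design is that the manual path is part of the same code path and gets exercised in tests, rather than being something people improvise on the day.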
 

bUU

Well-Known Member
I love it when a company plant comes to play.
As if the curmudgeons here were worth the bother.

Most of the competent technologists in central Florida are wary of working for Disney since they outsourced most of their IT a few years ago
Just like practically all of the large companies my colleagues and I have considered working for in the last five years.

Now, back to the topic... Has anyone heard what caused the issue?
No. Besides, having to deal with the actual facts of the situation would spoil the curmudgeon's fun.

As an IT Professional, the scary thing about all these posts is not only the truth behind them, but the complete failure of executives and upper management to make deployments bulletproof due to their ignorance.
The ones I've advised and worked for aren't ignorant. They look at the numbers and play the odds. They understand exactly what risks they're taking and they still take them because they also know the likely damages that they'll incur and realize that they are worth incurring. Even Equifax is pretty close to back to where it was before the breach.
 

bUU

Well-Known Member
So is it naive to believe there should be a backup system, or is it naive to believe they should have procedures for what to do if a system goes down (or both)?
Naive to believe that there isn't a backup system and naive to believe that the procedures when there is a failure of the backup system should include letting people in with no way of tracking them.
 

LSLS

Well-Known Member
Naive to believe that there isn't a backup system and naive to believe that the procedures when there is a failure of the backup system should include letting people in with no way of tracking them.

So you believe there is a backup system that also failed then (I mean, I assume there was, but you never know if they may have a full electronic backup or something else)? I can understand not letting everyone straight in, but is there seriously no better alternative than making everyone wait in line for guest services?
 

danlb_2000

Premium Member
So you believe there is a backup system that also failed then (I mean, I assume there was, but you never know if they may have a full electronic backup or something else)? I can understand not letting everyone straight in, but is there seriously no better alternative than making everyone wait in line for guest services?

If the outage was only a few hours long, I would think the risk of letting people in without verifying their access would be very small. How many people are actually going to show up in that time with invalid media just to exploit the outage? Even if they did, they are likely going to spend money in the park so it's not even a total loss.
 

lazyboy97o

Well-Known Member
If the outage was only a few hours long, I would think the risk of letting people in without verifying their access would be very small. How many people are actually going to show up in that time with invalid media just to exploit the outage? Even if they did, they are likely going to spend money in the park so it's not even a total loss.
They would also have to pay for parking just to try. Definitely cheaper than dealing with all of the people who are upset and accommodating changes to reservations.
 

flynnibus

Premium Member
but to expect that kind of redundancy to keep a bar code reader operating is irrational. SMH

Yes, it's irrational to expect any non-trivial system to be completely fault proof. But what you are single tracking and missing is... what goes beyond the software design. What is your business continuity plan when this system is down? That is not irrational... that is just good business sense.

We don't control the weather - but we do have a plan for when the weather interferes with our operation. It's the same thing here... you have a control point at the gates, what do you do when that control point is non-functional for an extended period of time?

The easiest and most customer friendly solution is to simply let people in. The few people that 'get in free' are inconsequential to the big picture.

Yes, the CMs at the entrance could've let them in anyway, but then you would've had tons of people claiming that they had a FP for an attraction, even when they didn't. Social media spreads news like wildfire nowadays, and we all know how entitled the idiots are now.

Easy solution... don't open the FP queues. Go without FP for the day. If you don't want to go to that extreme, you ask people to show their reservations on their personal devices. Don't have a personal device? Please go to the nearest FP+ kiosk and have a CM look it up and write a one-off FP ticket that could be redeemed at the attraction.

What you do would be driven by how big of an audience is impacted for how long.


The whole point is you should have flexibility to roll when your digital-only systems are down. If not... you really paint yourself in a corner where digital systems now pose a greater risk to your revenues and customer sat. Not a good place to be.
 

TheGuyThatMakesSwords

Well-Known Member
Yes, it's irrational to expect any non-trivial system to be completely fault proof. But what you are single tracking and missing is... what goes beyond the software design. What is your business continuity plan when this system is down? That is not irrational... that is just good business sense.

We don't control the weather - but we do have a plan for when the weather interferes with our operation. It's the same thing here... you have a control point at the gates, what do you do when that control point is non-functional for an extended period of time?

The easiest and most customer friendly solution is to simply let people in. The few people that 'get in free' are inconsequential to the big picture.



Easy solution... don't open the FP queues. Go without FP for the day. If you don't want to go to that extreme, you ask people to show their reservations on their personal devices. Don't have a personal device? Please go to the nearest FP+ kiosk and have a CM look it up and write a one-off FP ticket that could be redeemed at the attraction.

What you do would be driven by how big of an audience is impacted for how long.


The whole point is you should have flexibility to roll when your digital-only systems are down. If not... you really paint yourself in a corner where digital systems now pose a greater risk to your revenues and customer sat. Not a good place to be.

I absolutely LOVE this post.

My favorite? "The easiest and most customer friendly solution is to simply let people in. The few people that 'get in free' are inconsequential to the big picture ". DEAD RIGHT. In the absence of a "revert to paper" plan? DENY on media that everyone is getting in - then let everyone in :).

Think of it this way: "Do you NOT have a valid Ticket? We don't know. Therefore, say "friend", and enter" :). (pedo mellon a minno) :).

A week of this? WDW is going to develop a back-up plan :).
 
