The Infamous Checkin on Sunday, June 16, 2013

Leah.Davenport • June 16, 2013

If you do checkin at your church, you will know that checkin was crippled on Sunday morning. It was not completely broken, but it worked only about half the time. This has now been repaired.

We are very sorry for the trouble. You will need to go back and revalidate all of your attendance for Sunday, because some attendance was recorded and some was not.

The following paragraphs explain what happened and why. Warning, this may be TMI for you (too much information), so you can stop here if you like, knowing that the problem is resolved.

If you must know, here is the After Battle Debrief.

I have observed that Checkin takes too long between when you enter your phone number and when the list of family members shows up. Although it has not been unbearable, it was beginning to annoy me. Last weekend I put in some new logging that shows us the response time for each request in the system. It all looked good except for one type of request: the phone number lookup and corresponding family list. Those were taking between 2 and 3 seconds to complete. That does not sound like a long time, but in this day of high speed Internet expectations, it was too long.
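For the technically curious, here is a rough sketch of what per-request timing can look like. This is not the actual Checkin logging; the `timed` decorator, the `timings` dictionary, and the `phone_lookup` stand-in are all made-up names for illustration, and the real system is not written in Python.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical store: request type -> list of elapsed times in seconds.
timings = defaultdict(list)

def timed(request_type):
    """Record how long each call to the wrapped handler takes."""
    def decorate(handler):
        @wraps(handler)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return handler(*args, **kwargs)
            finally:
                # Runs even if the handler raises, so failures are timed too.
                timings[request_type].append(time.perf_counter() - start)
        return wrapper
    return decorate

@timed("phone_lookup")
def phone_lookup(phone):
    # Stand-in for the real lookup; the sleep simulates database work.
    time.sleep(0.01)
    return ["family member"]

result = phone_lookup("5551234")
slowest = max(timings["phone_lookup"])
```

Grouping the timings by request type is what makes a single slow kind of request stand out from the rest.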

I spent Friday night and all day Saturday looking into how I could reduce the response time. This resulted in a conversion and rewrite from C# into pure SQL code. When I put this new code through its paces, the results were very satisfying. Typically, the response was less than 1 second, quite reasonable. I tested this with our usual set of checkin numbers on our test site. All of it worked very nicely.

I was so pleased with the results, and knew you would be too (with the speed), that I took a risk and tried it out this morning. I knew I needed a Plan B in case Plan A went awry, and Plan B needed to be a quick switch that would put the entire system back to the old way within one minute. This morning, we got a call from our staff at Bellevue saying that Checkin was working for some but not for others. The first thing I did was to implement Plan B. That took me less than a minute. Just when I thought all would be well, we got another call a few minutes later saying that nothing worked. In other words, my Plan B was a total failure. So, knowing that Plan A at least worked partially, I switched back to it and started looking for the problem with Plan A. I assumed it would be a single problem, which is why I forged ahead instead of trying to get Plan B to work.

Before I go further, please note the following points about the family list on Checkin:

  • We display members of a checkin enabled organization.
  • There could be several choices for each family member based on multiple memberships or schedules
  • We display recent visitors of a checkin enabled org
  • Again there could be several based on the visit dates and how long each org allows visitors to continue to show up
  • In the absence of any of these, we display remaining family members with a notice that "no self checkin meetings are available"
  • We display only orgs and meetings that are in the specified campus.
  • We disable checkin based on late checkin minutes and early checkin hours and day of week
  • We disable checkin for pending members of an org
  • We support both PC based Checkin and iPad
  • We allow many fields to be edited on a family member's record, including basic contact details and other things like other church name, parents' names, etc.
  • There are many other factors, but all of the ones above came into play this Sunday.

So here is the sequence of things that went wrong as best as I can remember. Sorry for any technical terms, hard to avoid.

  1. I found out the name of the couple whose checkin organizations would not show up. They were staff leaders. The problem with them was that the Pending flag on their org member record was not true but not false either (it was NULL). I should have had it consider Pending to be false if it was not explicitly set to true. Once I did that, I tested and thought all was well. This was before 8:00.
     
  2. Then we got a call regarding another couple who could not check in. I investigated and found out that they had a former church name on their record that was too long for the temporary table I needed to put it in. It was crashing before any results were returned. Fixed that at 8:30 and thought all was well.
     
  3. About now the "fog of war" was setting in, and I can't remember the exact sequence but here is the best I can come up with.
     
  4. Now we start getting emails from other churches. This required a few back-and-forths to get enough details to reproduce. For churches that do not use a Campus, it was not recognizing their memberships. I found this problem; it was another NULL issue, this time with campusid, which needed to recognize a 0 as equivalent to a NULL. Got this fixed and thought all was well.
     
  5. Then I find out that recent visitors are not showing up for checkin. I realize that I am a long way from knowing "all is well" so I give up on that optimistic thought. This time, I thought that maybe campus was the problem here too, but it was not. It turns out the problem was in how we reset visitors: an org can set a date, and anyone who visited before that date stops showing up. This date was NULL for a number of organizations, which worked fine in C#, but in SQL nobody showed up. Fixed it... waited for the next bombshell.
     
  6. The next problem was that some orgs do not specify the number of checkin labels. There is supposed to be a reasonable default, but I was not using it. Worked for iPad, not for PC Checkin. Fixed it. 
     
  7. At this point, I had been seeing a lot of errors come through related to recording attendance. The complaint was that there was a NULL meeting date. I noticed that this was only happening on iPads. It is about 11:00 now and I am frazzled, wanting this day to be over.
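Several of the fixes in the list above (steps 1 and 4, and likely step 5) came down to the same thing: in SQL, comparing anything to NULL gives neither true nor false, so the row quietly drops out of the results. Here is a small sketch of that behavior using an in-memory SQLite database. The `org_member` table, its columns, and the sample rows are invented for illustration and are not the real Checkin schema, and the production database is not SQLite, although this NULL behavior is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; pending and campus_id may both be NULL.
conn.execute(
    "CREATE TABLE org_member (name TEXT, pending INTEGER, campus_id INTEGER)"
)
conn.executemany(
    "INSERT INTO org_member VALUES (?, ?, ?)",
    [("Ann", 0, 1), ("Bob", None, None), ("Cat", 1, 0)],
)

def names(sql):
    """Run a query and return the matching names in alphabetical order."""
    return [row[0] for row in conn.execute(sql + " ORDER BY name")]

# Naive filter: for Bob, NULL = 0 evaluates to NULL, so he is silently excluded.
naive = names("SELECT name FROM org_member WHERE pending = 0")

# Fix for step 1: treat a NULL pending flag as false.
fixed = names("SELECT name FROM org_member WHERE COALESCE(pending, 0) = 0")

# Fix for step 4: treat campus_id 0 and NULL as the same "no campus" value.
no_campus = names("SELECT name FROM org_member WHERE COALESCE(campus_id, 0) = 0")

print(naive, fixed, no_campus)
```

Here `naive` finds only Ann, while `fixed` finds both Ann and Bob. C# code that checks `pending == true` treats a null as "not pending" without any extra work, which is probably why the old code never hit this.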

This last bug was elusive. I could not understand what was happening because I use a contractor to develop our iPad app and don't have a way to debug it like I do the PC checkin. I found out by looking at it on my iPad that for people who were neither members nor recent visitors that even though it said "no checkin meetings available" next to a person's name, it would allow them to check the blue box to check themselves in. And apparently there were a bunch of people doing that. I thought that was a bit odd to have that many visitors. But bottom line, the problem would be fixed if I could prevent them from checking that button. 

I knew there had to be a difference between what was being sent to the iPad last week and what was being sent this week. The only way to find it was to install an older version on our test machine and capture the output. Sure enough, there was one minor difference. We have an indicator that shows whether a person is a Member or Visitor (M or V), but that indicator does not apply to someone who is neither. I was sending an empty string, but I should have been sending a NULL. The PC version did not care, but the iPad was expecting the NULL, and with the empty string, it was displaying the blue checkin button. This would not have been that big of a problem, but remember that before step 5 above was fixed, all recent visitors would have had this problem, since they were not being considered recent visitors. I was confused because I was looking at the problem after step 5 was fixed, when it was mitigated and would only have happened to new visitors, rather than before.
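To show why an empty string and a NULL are not interchangeable, here is a small sketch. I am assuming a JSON payload and a client that shows the button whenever the indicator is anything other than null; the actual wire format and the iPad app's logic are not shown in this post, and every name below is hypothetical.

```python
import json

def build_payload(name, status):
    """Serialize one family member; status is 'M', 'V', or None for neither."""
    return json.dumps({"name": name, "status": status})

def client_shows_checkin_button(payload):
    # Assumed client rule: only a null status hides the checkin button.
    return json.loads(payload)["status"] is not None

def normalize_status(status):
    """Server-side guard: turn an empty or blank indicator into None."""
    return status if status and status.strip() else None

buggy = build_payload("Visitor Joe", "")                   # empty string
fixed = build_payload("Visitor Joe", normalize_status(""))  # null

print(client_shows_checkin_button(buggy))  # True: button wrongly shown
print(client_shows_checkin_button(fixed))  # False: button hidden
```

Normalizing the value in one place on the server keeps both clients happy, whichever one of them is strict about the difference.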

Regardless, I fixed it and at that point it was about Noon and only the smoke of the battle remained. Checkin for Sunday, June 16 was over. Although we took many casualties, we had won the war. Now we just need to bury all the bodies...
