The elusive bug

(Note: This happened earlier this year. A recent discussion reminded me that I should write about it, so here it is.)

Symptoms

The bug description wasn’t very helpful. Users couldn’t scroll on their Macs on this one website. It affected only a few Macs, not all of them, and no one could reproduce it locally. Other operating systems (including iOS) worked fine for the same users, and other websites scrolled just fine. The browser didn’t matter: if one of Firefox, Chrome, or Safari didn’t work, the others had the same issue. The reports had been coming in for almost a year through the in-page “Contact us” button, but without a way to replicate the issue, it was impossible to fix. It started with a single user complaining about not being able to scroll, and by the time it was fixed, a new report was arriving almost every week.

I like unsolvable mysteries and I was very interested in this one. I took it up and this is how it turned out (the whole thing happened over the course of a week).

Probable causes

There wasn’t much to go on so I started off with two possible reasons:

The OS

I didn’t expect the OS to be at fault, but when I searched for scroll issues on Mac I found two bug reports, one filed against Chromium and one against Firefox. They described an Apple bug related to scrolling, so it fit the problem description. The part that didn’t fit was that that issue was triggered by a Mac scroll gesture, but maybe we had hit another, related bug. While I was certain this wasn’t like one of those Internet Explorer CSS bugs, there could be a CSS property that relied on the OS and was in turn rendered incorrectly by the OS rendering engine. If the OS really was at fault (according to my notes it seemed unlikely to me at the time, because the problem would definitely have been more widespread), there wasn’t much to do except try to reproduce it.

Browser plugin

A badly written browser plugin could well be the culprit. The bug reporting tool on the website included the list of installed plugins, and the only two present in every bug report were Shockwave Flash and WebKit built-in PDF. Not helpful, because these are default plugins and always installed. The next candidates in the list, Silverlight Plug-In and iPhotoPhotocast, were installed on 60% of the browsers. Both are commonly installed plugins and were probably not at fault either.
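The tally described above is easy to sketch in a few lines of JavaScript. This is only an illustration: the report shape (`{ plugins: [...] }`), the function name `pluginShares`, and the sample data are all made up here, not taken from the actual reporting tool.

```javascript
// Hypothetical report shape: each bug report carries the list of plugin
// names captured by the reporting tool. The sample data below is invented.
function pluginShares(reports) {
  const counts = new Map();
  for (const report of reports) {
    // Count each plugin once per report, even if it is listed twice.
    for (const name of new Set(report.plugins)) {
      counts.set(name, (counts.get(name) || 0) + 1);
    }
  }
  // Convert counts into a share of all reports, most common first.
  return [...counts.entries()]
    .map(([name, count]) => ({ name, share: count / reports.length }))
    .sort((a, b) => b.share - a.share);
}

const sampleReports = [
  { plugins: ["Shockwave Flash", "WebKit built-in PDF", "Silverlight Plug-In"] },
  { plugins: ["Shockwave Flash", "WebKit built-in PDF"] },
];
console.log(pluginShares(sampleReports));
```

A plugin with a share of 1 appears in every report, which is necessary for it to be the culprit but, as with the default plugins here, far from sufficient.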

The debugging process

I realized that if I was to figure this out, I’d have to pinpoint the source of the problem by trying out different things and asking users to test them. I started with a basic page template and created test pages by stripping things out. Only a handful of the people who reported the issue were willing to help us fix it, so I had to be careful not to overwhelm them with tests. Some of the test pages I created were:

  1. With third party JS removed
  2. Non-minified CSS
  3. Page with no CSS

Not much progress here, since all the tests came back negative. After another iteration of test pages, I was fairly certain that this was related to some software on the user’s end. Luckily, an enterprising user tried out the scenario for us and disabled browser extensions until the issue went away.

The extension had been installed by a video downloader, which adds extensions to integrate itself into every available browser. Finally, the issue could be reproduced! But how could a browser extension mess up the scrolling on a website?

I should note at this point that I don’t use or own a Mac. If I was going to figure this out, I’d have to get my hands on one. I tried out BrowserStack, but the connection was so slow I couldn’t get anything done. It’d be great if they just handed out VNC details so I could connect myself and control the quality at my end. I didn’t want to let this go when I was so close, and had no choice but to install a Mac OS VM.

I installed Firefox and the video downloader on my VM and copied the browser extension directory back to my machine. I had some experience writing a Firefox extension, so I knew where to start looking. The extension loads its own CSS and JS on every page. The CSS was the first thing I opened and right at the very top was this CSS block:

.clear { /* generic container (i.e. div) for floating buttons */
    overflow: hidden !important;
    width: 100% !important;
}

A generic class name (even they realized that) in a stylesheet that was injected on every page. I recognized the class as one used in our code to clear floats. The extension’s “overflow: hidden” would come in and mess up scrolling. A small change to the rule name confirmed this was what needed to be fixed.

The Fix

The real fun was about to begin. A grep of the code base told me that I had 80 files (this is legacy code, not well written by today’s standards) to sift through to rename this CSS class. Not a lot, but it’d take me a few hours at the very least. And it wasn’t something that I could automate with a Perl pie (a perl -pi -e one-liner). Resetting the CSS class with JavaScript was difficult because the injected stylesheet also set a width on the class and I couldn’t find any way to unset that rule. The naive, hopeful me thought I could just loop through document.styleSheets in JS and remove the conflicting CSS as a temporary fix until the maintainers fixed the issue at their end. Turns out, for security reasons, browsers don’t like it when you do this. In the end, I had to go through each file and rename the CSS class.
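For illustration, here is roughly what that abandoned temporary fix might look like. It is a sketch under assumptions, not the code that was actually tried: `removeMatchingRules` is a hypothetical helper name, and it assumes the injected stylesheet shows up in document.styleSheets at all, which may not hold for extension-injected CSS. The one well-known obstacle it does show is that reading `cssRules` on a stylesheet from another origin throws a SecurityError.

```javascript
// Sketch only. `styleSheets` is anything shaped like document.styleSheets;
// `predicate(rule)` decides which rules to drop. removeMatchingRules is a
// hypothetical helper name, not a browser API.
function removeMatchingRules(styleSheets, predicate) {
  const result = { removed: 0, blocked: 0 };
  for (const sheet of Array.from(styleSheets)) {
    let rules;
    try {
      // Reading cssRules throws a SecurityError when the sheet comes from
      // another origin and was not loaded with CORS approval.
      rules = sheet.cssRules;
    } catch (err) {
      result.blocked += 1;
      continue;
    }
    // Walk backwards so deleteRule() does not shift the indexes still to visit.
    for (let i = rules.length - 1; i >= 0; i--) {
      if (predicate(rules[i])) {
        sheet.deleteRule(i);
        result.removed += 1;
      }
    }
  }
  return result;
}

// In a browser, one might target only the conflicting rule and leave the
// site's own .clear rule alone, for example:
//   removeMatchingRules(document.styleSheets, (rule) =>
//     rule.selectorText === ".clear" &&
//     rule.style.getPropertyValue("overflow") === "hidden");
```

The predicate matters because the site itself also defines a .clear class, so deleting every rule with that selector would break the float clearing too. Any sheet counted under `blocked` is one the page script is not allowed to inspect, and that is exactly where a fix like this stalls.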

Epilogue

I sent an email to the maintainers of the video downloader letting them know that their browser extension wasn’t playing nice. They got back to me and said they were going to fix the issue but I haven’t gone back and checked if they actually did. I hope they did stop breaking the internet.

Posted in bugs, computers, tech