Tag Archives: ci

Firefox Automation report – Q2 2015

It’s been a while since I wrote my last Firefox automation report, so lets do another one to wrap up everything happened in Q2 this year. As you may remember from my last report the team has been cut down to only myself, and beside that I was away the whole April. Means only 2 months worth of work will be covered this time.

In general it was a tough quarter for me. Working alone and having to maintain all of our infrastructure and keeping both of our tests (Mozmill and Marionette) green was more work as expected. But it still gave me enough time to finish my two deliverables:

  • Ensure better visibility of Firefox UI test results for sheriffs and developers by uploading them to treeherder
  • Finalize features for Firefox UI update tests, and help to get them running on RelEng hardware

Firefox UI Test Results on Treeherder

With the transition of Mozmill tests to Marionette we no longer needed the mozmill-dashboard. Especially not since nearly no job in our Mozmill CI is running Mozmill anymore. Given that we also weren’t happy with the dashboard solution and the usage of couchdb under the hood, I was happy that a great replacement exist and our Marionette tests can make use of.

Treeherder is the new reporting system for tests being run in buildbot continuous integration. Because of that it covers all products and their appropriate supported branches. That’s why it it’s kinda perfect for us.

Before I was able to get started a bit of investigation had to be done, and I also checked some other projects which make use of treeherder reporting. That information was kinda helpful even with lots of undocumented code. Given that our tests are located outside of the development tree in its own Github repository some integration can be handled more loosely compared to in-tree tests. With that respect we do not appear as Tier-1 or Tier-2 but results are listed under the Tier-3 level, which is hidden by default. But that’s fine given that our goal was to bring up reports to treeherder first. Going into details will be a follow-up task.

For reporting results to Treeherder the Python package treeherder-client can be used. It’s a collection of different classes which help with authentication, collecting job details, and finally uploading the data to the server. It’s documentation can be found on readthedocs and meanwhile contains a good number of example code which makes it easier to get your code implemented.

Before you can actually upload reports the names and symbols of the groups and jobs have to be defined. For our tests I have chosen the three group symbols “Fu”, “Ff”, and “Fr”. Each of those stays for “(F)irefox UI Tests” and the first letter of the testrun name. We currently have “updates”, “functional”, and “remote”. As job name the locale of Firefox will be used. That results in an output like the following:

Treeherder Results

Whether jobs are passing or failing it is recommended to always add the log files as generated by running the tests to the report. It will not happen via data URLs but as artifacts with a link to an upload location. In our case we make use of Amazon S3 as storage service.

With all the pieces implemented we can now report to treeherder and covering Nightly, Aurora (Developer Edition), Beta, and Release builds. As of now reporting only happens against the staging instance of Treeherder, but soon we will be able to report to production. If you want to have a sneak peak how it works, just follow this link for Nightly builds.

More details about Treeherder reporting I will do later this quarter when the next pieces have been implemented.

Firefox UI Update Tests under RelEng

My second deliverable was to assist Armen Zambrano in getting the Firefox UI Update tests run for beta and release builds on Release Engineering infrastructure. This was a kinda important goal for us given that until now it was a manually triggered process with lots of human errors on our own infrastructure. That means lots of failures if you do not correctly setup the configuration for the tests, and a slower processing of builds due to our limited available infrastructure. So moving this out of our area made total sense.

Given that Armen had already done a fair amount of work when I came back from my PTO, I majorly fixed issues for the tests and the libraries as pointed out by him. All that allowed us to finally run our tests on Release Engineering infrastructure even with a couple of failures at the beginning for the first beta. But those were smaller issues and got fixed quickly. Since then we seem to have good results. If you want to have a look in how that works, you should check the Marionette update tests wiki page.

Sadly some of the requirements haven’t been completely finished yet. So the Quality Engineering team cannot stop running the tests themselves. But that will happen once bug 1182796 has been fixed and deployed to production.

Oh, and if you wonder where the results are located… Those are not getting sent to Treeherder but to an internal mailing list as used for every other automation results.

Other Work

Beside the deliverables I got some more work done. Mainly for the firefox-ui-tests and mozmill-ci.

While the test coverage has not really been increased, I had a couple of regressions to fix as caused by changes in Firefox. But we also landed some new features thankfully as contributed by community members. Once all that was done and we agreed to have kinda stable tests, new branches have been created in the repository. That was necessary to add support for each supported version of Firefox down to ESR 38.0, and to be able to run the tests in our Mozmill CI via Jenkins. More about that you will find below. The only task I still haven’t had time for yet was the creation of proper documentation about our tests. I hope that I will find the time in Q3.

Mozmill CI got the most changes in Q2 compared to all the former quarters. This is related to the transition from Mozmill tests to Marionette tests. More details why we got rid of Mozmill tests can be found in this post. With that we decided to get rid of most of the tests and mainly start from scratch by only porting the security and update tests over to Marionette. The complete replacement in Mozmill and all its jobs can be seen on issue 576 on Github. In detail we have the following major changes:

  • Run all jobs with Marionette beside Firefox ESR 31.0 which is not supported by Marionette, and ondemand_update jobs because they still have to be run by Quality Engineering.
  • Reduced number of platforms. We got rid of Windows Vista, Ubuntu 14.10, and OS X 10.7 whereby the latter machines have been re-used for OS X 10.10.
  • No usage of a pre-configured environments anymore, but creating it from fresh for each test-run by installing Python packages from the internal PyPI mirror.
  • Sending test results to treeherder and giving public access for everyone.
  • Stopped sending emails for failures to our mozmill-ci mailing list in favor of having treeherder results.

All changes in Mozmill CI can be seen on Github.

Last but not least we also had two releases of mozdownload in Q2. Both had a good amount of features included. For details you can check the changelog.

I hope that gave you a good quick read on the stuff I was working on last quarter. Maybe in Q3 I will find the time to blog more often and in more detail. Lets see.

Firefox Automation report – week 51/52 2014

In this post you can find an overview about the work happened in the Firefox Automation team during week 51 and 52 of 2014. I’m sorry for this very late post but changes to our team, which I will get to in my next upcoming post, caught me up with lots of more work and didn’t give me the time for writing status reports.

Highlights

Henrik started work towards a Mozmill 2.1 release. Therefore he had to upgrade a couple of mozbase packages first to get latest Mozmill code on master working again. Once done the patch for handling parent sections in manifest files finally landed, which was originally written by Andrei Eftimie and was sitting around for a while. That addition allows us to use mozhttpd for serving test data via a local HTTP server. Last but not least another important feature went in, which let us better handle application disconnects. There are still some more bugs to fix before we can actually release version 2.1 of Mozmill.

Given that we only have the capacity to fix the most important issues for the Mozmill test framework, Henrik started to mass close existing bugs for Mozmill. So only a hand-full of bugs will remain open. If there is something important you want to see fixed, we would encourage you to start working on the appropriate bug.

For Mozmill CI we got the new Ubuntu 14.10 boxes up and running in our staging environment. Once we can be sure they are stable enough, they will also be enabled in production.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 51 and week 52.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meeting of week 51 and week 52.

Firefox Automation report – week 47/48 2014

In this post you can find an overview about the work happened in the Firefox Automation team during week 47 and 48.

Highlights

Most of the work during those two weeks made by myself were related to get [Jenkins](http://jenkins-ci.org/ upgraded on our Mozmill CI systems to the most recent LTS version 1.580.1. This was a somewhat critical task given the huge number of issue as mentioned in my last Firefox Automation report. On November 17th we were finally able to get all the code changes landed on our production machine after testing it for a couple of days on staging.

The upgrade was not that easy given that lots of code had to be touched, and the new LTS release still showed some weird behavior when connecting slave nodes via JLNP. As result we had to stop using this connection method in favor of the plain java command. This change was actually not that bad because it’s better to automate and doesn’t bring up the connection warning dialog.

Surprisingly the huge HTTP session usage as reported by the Monitoring plugin was a problem introduced by this plugin itself. So a simple upgrade to the latest plugin version solved this problem, and we will no longer get an additional HTTP connection whenever a slave node connects and which never was released. Once we had a total freeze of the machine because of that.

Another helpful improvement in Jenkins was the fix for a JUnit plugin bug, which caused concurrent builds to hang, until the former build in the queue has been finished. This added a large pile of waiting time to our Mozmill test jobs, which was very annoying for QA’s release testing work – especially for the update tests. Since this upgrade the problem is gone and we can process builds a lot faster.

Beside the upgrade work, I also noticed that one of the Jenkins plugins in use, it’s actually the XShell plugin, failed to correctly kill the running application on the slave machine in case of an job is getting aborted. The result of that is that following tests will fail on that machine until the not killed job has been finished. I filed a Jenkins bug and did a temporary backout of the offending change in that plugin.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 47 and week 48.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 47 and week 48.

Firefox Automation report – week 45/46 2014

In this post you can find an overview about the work happened in the Firefox Automation team during week 45 and 46.

Highlights

In our Mozmill-CI environment we had a couple of frozen Windows machines, which were running with 100% CPU load and 0MB of memory used. Those values came from the vSphere client, and didn’t give us that much information. Henrik checked the affected machines after a reboot, and none of them had any suspicious entries in the event viewer either. But he noticed that most of our VMs were running a very outdated version of the VMware tools. So he upgraded all of them, and activated the automatic install during a reboot. Since then the problem is gone. If you see something similar for your virtual machines, make sure to check that used version!

Further work has been done for Mozmill CI. So were finally able to get rid of all the traces for Firefox 24.0ESR since it is no longer supported. Further we also setup our new Ubuntu 14.04 (LTS) machines in staging and production, which will soon replace the old Ubuntu 12.04 (LTS) machines. A list of the changes can be found here.

Beside all that Henrik has started to work on the next Jenkins v1.580.1 (LTS) version bump for the new and more stable release of Jenkins. Lots of work might be necessary here.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 45 and week 46.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 45 and week 46.

Firefox Automation report – week 43/44 2014

In this post you can find an overview about the work happened in the Firefox Automation team during week 43 and 44.

Highlights

In preparation for the QA-wide demonstration of Mozmill-CI, Henrik reorganized our documentation to allow everyone a simple local setup of the tool. Along that we did the remaining deployment of latest code to our production instance.

Henrik also worked on the upgrade of Jenkins to latest LTS version 1.565.3, and we were able to push this upgrade to our staging instance for observation. Further he got the Pulse Guardian support implemented.

Mozmill 2.0.9 and Mozmill-Automation 2.0.9 have been released, and if you are curious what is included you want to check this post.

One of our major goals over the next 2 quarters is to replace Mozmill as test framework for our functional tests for Firefox with Marionette. Together with the A-Team Henrik got started on the initial work, which is currently covered in the firefox-greenlight-tests repository. More to come later…

Beside all that work we have to say good bye to one of our SoftVision team members.October the 29th was the last day for Daniel on the project. So thank’s for all your work!

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 43 and week 44.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 43 and week 44.

Firefox Automation report – week 41/42 2014

In this post you can find an overview about the work happened in the Firefox Automation team during week 41 and 42.

With the beginning of October we also have some minor changes in responsibilities of tasks. While our team members from SoftVision mainly care about any kind of Mozmill tests related requests and related CI failures, Henrik is doing all the rest including the framework and the maintenance of Mozmill CI.

Highlights

With the support for all locales testing in Mozmill-CI for any Firefox beta and final release, Andreea finished her blacklist patch. With that we can easily mark locales not to be tested, and get rid of the long white-list entries.

We spun up our first OS X 10.10 machine in our staging environment of Mozmill CI for testing the new OS version. We hit a couple of issues, especially some incompatibilities with mozrunner, which need to be fixed first before we can get started in running our tests on 10.10.

In the second week of October Teodor Druta joined the Softvision team, and he will assist all the others with working on Mozmill tests.

But we also had to fight a lot with Flash crashes on our testing machines. So we have seen about 23 crashes on Windows machines per day. And that all with the regular release version of Flash, which we re-installed because of a crash we have seen before was fixed. But the healthy period did resist long, and we had to revert back to the debug version without the protect mode. Lets see for how long we have to keep the debug version active.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 41 and week 42.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 41 and week 42.

Firefox Automation report – week 39/40 2014

In this post you can find an overview about the work happened in the Firefox Automation team during week 39 and 40.

Highlights

One of our goals for last quarter was to get locale testing enabled in Mozmill-CI for each and every supported locale of Firefox beta and release builds. So Cosmin investigated the timing and other possible side-effects, which could happen when you test about 90 locales across all platforms! The biggest change we had to do was for the retention policy of logs from executed builds due to disk space issues. Here we not only delete the logs after a maximum amount of builds, but also after 3 full days now. That gives us enough time for investigation of test failures. Once that was done we were able to enable the remaining 60 locales. For details of all the changes necessary, you can have a look at the mozmill-ci pushlog.

During those two weeks Henrik spent his time on finalizing the Mozmill update tests to support the new signed builds on OS X. Once that was done he also released the new mozmill-automation 2.0.8.1 package.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 39 and week 40.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 39 and week 40.

Firefox Automation report – week 33/34 2014

In this post you can find an overview about the work happened in the Firefox Automation team during week 33 and 34.

Highlights

To make sure that our weekly meetings will be more visible to our community, we got them added to the community calendar. If you are interested in what’s going on for Firefox Automation you are welcome to join our Monday’s team meeting.

In regards of the Mozmill project, Henrik landed his patch, which makes Mozmill more descriptive in terms of unexpected application shutdowns. Especially in the past weeks we have seen that Firefox does not restart as expected, but simply quits. There is bug 1057246 filed for the underlying problem. So with the patch landed, Mozmill will log that correctly in the results. Beside that we can also better see when crashes or a not by Mozmill triggered quit happens.

For Mozmill CI we landed a couple of enhancements and fixes. The most important ones were indeed the addition of 20 new locales for testing beta and release builds of Firefox across supported platforms. That means we cover 30 of about 95 active locales now. To cover them all, a good amount of follow-up work is still necessary. Immediately we stopped to run add-on tests for all branches except Nightly builds to save more time on our machines.

Henrik also continued on PuppetAgain integration for our staging and production CI systems. One of the blockers was the missing proxy support, but with the landing of the patch on bug 1050268 all proxy related work should have been done now.

Also on the continuous integration for TPS tests we made progress. The implementation got that far for Coversheet that we made the Jenkins branch the active master. There are still issues to implement or get fixed before the Jenkins driven CI can replace the old hand-made one.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 33 and week 34.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 33 and week 34.

Firefox Automation report – week 31/32 2014

In this post you can find an overview about the work happened in the Firefox Automation team during week 31 and 32. It’s a bit lesser than usual, mainly because many of us were on vacation.

Highlights

The biggest improvement as came in during week 32 were the fixes for the TPS tests. Cosmin spent a bit of time on investigating the remaining underlying issues, and got them fixed. Since then we have a constant green testrun, which is fantastic.

While development for the new TPS continuous integration system continued, we were blocked for a couple of days by the outage of restmail.net due to a domain move. After the DNS entries got fixed, everything was working fine again for Jenkins and Mozilla Pulse based TPS tests.

For Mozmill CI we agreed on that the Endurance tests we run across all branches are not that useful, but only take a lot of time to execute – about 2h per testrun! The most impact also regarding of new features landed will be for Nightly. So Henrik came up with a patch to only let those tests run for en-US Nightly builds.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 31 and week 32.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 32. There was no meeting in week 31.

Firefox Automation report – week 29/30 2014

In this post you can find an overview about the work happened in the Firefox Automation team during week 29 and 30.

Highlights

During week 29 it was time again to merge the mozmill-tests branches to support the upcoming release of Firefox 31.0. All necessary work has been handled on bug 1036881, which also included the creation of the new esr31 branch. Accordingly we also had to update our mozmill-ci system, and got the support landed on production.

The RelEng team asked us if we could help in setting up Mozmill update tests for testing the new update server aka Balrog. Henrik investigated the necessary tasks, and implemented the override-update-url feature in our tests and the mozmill-automation update script. Finally he was able to release mozmill-automation 2.6.0.2 two hours before heading out for 2 weeks of vacation. That means Mozmill CI could be used to test updates for the new update server.

Individual Updates

For more granular updates of each individual team member please visit our weekly team etherpad for week 29 and week 30.

Meeting Details

If you are interested in further details and discussions you might also want to have a look at the meeting agenda, the video recording, and notes from the Firefox Automation meetings of week 29 and week 30.