Author Topic: Automated integration and regression extension and library  (Read 3967 times)
Offline (Unknown gender) forthevin
Posted on: July 25, 2013, 07:49:28 AM

Contributor
Joined: Jun 2012
Posts: 171

Hi everyone. I am happy to say that I have created an extension and library for automated integration and regression testing. The extension has been committed and is called IntegrationTesting, and the library has been pushed to a branch called integration_testing_develop: https://github.com/enigma-dev/enigma-dev/tree/integration_testing_develop. The test system is designed so that tests can be run, and their results recorded, both manually by developers and automatically.

In regards to running it automatically, I think it would be a good idea if we also looked into using a Continuous Integration (CI) system. A CI system does a number of different things, but is basically about continuously building a project, running its tests, generating test reports, and so on, automatically. For instance, you could set up a CI system that builds a given project every time a commit is made to the master branch of the main repository, runs all tests against the newest version, generates test reports, and reports whether there have been any regressions since the previous commit. This makes it really easy to track down exactly when a bug covered by a test was introduced.
A number of CI systems already exist. One open-source CI system which seems to be widely used is Jenkins, which is used by Apache, Mozilla and others. I have added an output format for the generated test report that is accepted by Jenkins' default JUnit plugin for parsing test results. It's a bit hacky, since the testing system I have implemented is not based on JUnit, but I believe it should work well enough. If we need another test report output, I can write one for that as well.

I have uploaded the test report and an output version generated from the current tests such that you can see a bit of what it looks like:
Example of the general test report: http://pastebin.com/2756rkG5
Example of an HTML output version: https://www.dropbox.com/s/ycqp8ck13na1e4g/testreport.html
The HTML version is somewhat simple, but should give a good overview.

As for the testing system itself, I decided to go with integration testing instead of unit testing. Unit testing means testing each small part independently, while integration testing means testing all or part of the whole integrated system at once (such as running a game). Given that it is easy to create and run games, and that most of the APIs do not change much over time, I decided to make each test a game. In effect, each added test is another game that we verify compiles, runs, and obeys some properties (for instance, for a graphical shapes test, if you draw a filled red rectangle, then the color at the center of the rectangle should be red).
Since integration tests rely first and foremost on the interfaces and APIs rather than on the specific implementation, and since most of our interfaces and APIs do not change much over time, the majority of the tests we write now will not need any changes or maintenance in the future. It also means that most tests will be valid for different implementations of the same interface: if a test is written for the graphics system OpenGL1, it will also be valid for OpenGL3 and DirectX.

The current extension and library should be ready for use, though I would first like to get your general approval of, and comments on, the system. This includes the current feature set, the dependencies, what features ought to be available, etc.

In regards to features, the best links for getting an impression of them are the following:
https://github.com/enigma-dev/enigma-dev/blob/integration_testing_develop/integration_tests/README.md
https://github.com/enigma-dev/enigma-dev/blob/integration_testing_develop/integration_tests/format/test_file_format
https://github.com/enigma-dev/enigma-dev/blob/integration_testing_develop/ENIGMAsystem/SHELL/Universal_System/Extensions/IntegrationTesting/integration_testing.h

In regards to dependencies, the testing system currently depends on Python 3.3. The main reasons for choosing Python are partly that we already use Python (install.py), and partly that it is a cross-platform, expressive scripting language. The reason for requiring version 3.3 is a couple of features, including timeout support when running commands outside Python, such as when running a test. I am a bit in doubt whether it is a good idea to rely on Python 3.3 given that it is relatively new, but if relying on it turns out to be a bad idea, it shouldn't be too bothersome to rewrite the libraries to be compatible with an older version of Python.
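The Python 3.3 timeout feature mentioned above can be sketched roughly like this (a minimal illustration under stated assumptions, not the actual run_and_record_tests.py code; the test command is a stand-in):

```python
import subprocess

def run_test(command, timeout_seconds=30):
    """Run one test command; treat a nonzero exit or a hang as failure.
    Relies on the timeout= parameter that subprocess gained in Python 3.3."""
    try:
        # subprocess.call waits for the command and returns its exit code;
        # if timeout_seconds elapse first, the child is killed and
        # TimeoutExpired is raised.
        return subprocess.call(command, timeout=timeout_seconds) == 0
    except subprocess.TimeoutExpired:
        return False  # a hung test game counts as a failure

if __name__ == "__main__":
    # "true" exits with status 0 on Unix-like systems.
    print(run_test(["true"]))
```

The timeout matters because a regressed test game may simply never exit, and the harness must still finish and report.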

In regards to the near future assuming no major changes to the testing system, I will look at adding desired features to the testing system, fixing bugs in the testing system, writing tests and fixing issues. I will also look at spending time on writing and updating documentation, including on the wiki.

A final note: running the system now (using python3 integration_tests/run_and_record_tests.py on Ubuntu 13.04) will not give very interesting results, since the plugin needs to be built and uploaded to Dropbox first (the test system requires the plugin to have some additional features, which I have committed to the master branch).
Logged
Offline (Male) Josh @ Dreamland
Reply #1 Posted on: July 25, 2013, 09:30:13 AM

Prince of all Goldfish
Developer
Location: Pittsburgh, PA, USA
Joined: Feb 2008
Posts: 2959

Extra awesome. ENIGMA has needed this for a long time, but I never got around to doing it; instead I ended up joining the others in impatiently pressing forward with whatever we had.

I will give this a look over next time I'm free. Whenever that is.

Thanks a ton for the work!
"That is the single most cryptic piece of code I have ever seen." -Master PobbleWobble
"I disapprove of what you say, but I will defend to the death your right to say it." -Evelyn Beatrice Hall, Friends of Voltaire
Offline (Unknown gender) forthevin
Reply #2 Posted on: July 27, 2013, 05:58:20 AM

Contributor
Joined: Jun 2012
Posts: 171

Thanks, I am happy to help :).

Josh, that reminds me of another thing I have thought about: I think the automated regression testing would work really well together with the idea you previously proposed of having multiple branches. Development could then go like this: develop something, commit it to the testing branch, wait for the continuous integration system to finish running the tests on the testing branch on multiple platforms, check that there are no regressions, and merge the changes into master. That way, it would take almost no work to run all tests on multiple platforms before committing to master. What do you think?
Offline (Male) Josh @ Dreamland
Reply #3 Posted on: July 27, 2013, 06:31:34 AM

Prince of all Goldfish
Developer
Location: Pittsburgh, PA, USA
Joined: Feb 2008
Posts: 2959

Agreed.

While this would ideally be done as a complement to, rather than a replacement for, manual testing, until the regression testing suite is more complete, I think it's in our best interest to set something like this up as soon as we are prepared for branching. My only major concern is how it will handle things like antialiasing and rounding issues (for arithmetic tests). For example, the test suite might expect a given pixel to be any color between pure blue and pure white, while in fact it ended up being one or the other due to minute hardware differences. Likewise, the suite may expect .1 + .1 + .1 to be roughly .3, while in fact the value differs with the phase of the moon. On a happier note, I have high hopes for its performance with string, data structure, and I/O operations, as those tend to have only one right answer (not to imply that .30000000004 is a right answer to .1 + .1 + .1).
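The arithmetic worry above is the classic binary floating-point pitfall; a test can hedge against it by comparing within a tolerance rather than exactly (a minimal sketch, not code from the testing branch):

```python
def roughly_equal(a, b, epsilon=1e-9):
    """True when a and b differ by no more than epsilon."""
    return abs(a - b) <= epsilon

total = 0.1 + 0.1 + 0.1           # actually 0.30000000000000004 in binary
print(total == 0.3)               # False: exact comparison fails
print(roughly_equal(total, 0.3))  # True: the tolerance absorbs the rounding
```

The same idea generalizes to any arithmetic test: assert a bound on the error, never exact equality of floats.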

That said, if the system seems to do well in these cases, I'm fine entrusting it with the complete task, especially considering that one of the worst (and, incidentally, most frequent) regressions is a complete inability to compile with ENIGMA on one platform or another.

It would be particularly useful if we could get cross-compilation working as part of this system, too, as contention between the tastes and preferences of MinGW on Linux vs. Windows has led, on a thousand separate occasions, to the dysfunction of one system or the other. Reports are rolling in now that OpenAL doesn't work at all on Windows, and the simple explanation for this is that cheeseboy couldn't get AL working on MinGW for Linux, so he packaged his half-assed SFML system in his installer without paying any attention to AL. Then polygone and Robert decided, "WOW, A WINDOWS INSTALLER THAT ACTUALLY WORKS," buried my zip patch elsewhere on the Wiki, and put his up as the only installation candidate. Also, the bundled SFML doesn't work on Windows. Look where that leaves us.

The point is, as far as making sure the system actually compiles on each platform, this automated testing is invaluable. Pitted against our current system of "commit everything to master and wait for the flood of bug reports," I think the winner is clear.
Offline (Unknown gender) TheExDeus
Reply #4 Posted on: July 28, 2013, 05:34:40 AM

Developer
Joined: Apr 2008
Posts: 1872

Quote
My only major concern is how it will handle things like antialiasing and rounding issues (for arithmetic tests). For example, the test suite might expect a given pixel to be any color between pure blue and pure white, while in fact it ended up being one or the other due to minute hardware differences.
From my research on this topic, I think the only non-pixel-perfect thing the OpenGL standard allows is line drawing. Drawing polygons and other shapes should be identical. There could be an issue with one-pixel offsets here and there, but I think they can be fixed (as can line drawing) by adding half a pixel in all line drawing functions and in the projection functions. That's because the biggest difference between NVIDIA and ATI is that one rounds down and the other rounds up.
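The half-pixel nudge described above could look something like this (a hypothetical sketch; draw_line here stands in for whatever backend drawing call would be wrapped):

```python
def snap_to_pixel_center(x, y):
    """Shift an integer pixel coordinate onto the pixel's center."""
    return (x + 0.5, y + 0.5)

def draw_line_snapped(x1, y1, x2, y2, draw_line):
    """Apply the half-pixel offset before delegating to draw_line, so
    drivers that round sample positions in opposite directions still
    rasterize the same pixels."""
    ax, ay = snap_to_pixel_center(x1, y1)
    bx, by = snap_to_pixel_center(x2, y2)
    draw_line(ax, ay, bx, by)
```

With endpoints on pixel centers, there is no ambiguous halfway case left for the driver to round up or down.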

Quote
Reports are rolling in now that OpenAL doesn't work at all on Windows
Yes, I have actually noticed that for almost 6 months now. I don't really use it, so it didn't hurt me that much, and I haven't had the time to fix it either.

And yes, none of the Windows installers actually work except Josh's, but it's old as well and so needs additional fiddling.
Offline (Unknown gender) forthevin
Reply #5 Posted on: July 28, 2013, 09:31:01 AM

Contributor
Joined: Jun 2012
Posts: 171

In regards to the issues with approximate results, I think checking against wide bounds, instead of checking for precise values, will be a decent solution for most cases. While such a test will not detect small deviations, using wide bounds should avoid false positives, and we still get a lot of the advantages of testing, including whether the game compiles. For those cases where we can figure out how to do precise checking without any false positives, we of course ought to do that. I agree that things like data structures and strings should work very well.
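A minimal sketch of such a wide-bounds check (illustrative only; the function name and the slack threshold are made up here, not taken from the extension):

```python
def pixel_roughly(pixel, expected, slack=32):
    """True when every channel of pixel (r, g, b) is within slack of the
    corresponding channel of expected, on a 0..255 scale."""
    return all(e - slack <= v <= e + slack
               for v, e in zip(pixel, expected))

# The center of a filled red rectangle should be roughly pure red,
# even if the driver shifts channel values slightly:
print(pixel_roughly((250, 3, 6), (255, 0, 0)))      # True
print(pixel_roughly((128, 128, 255), (255, 0, 0)))  # False
```

The slack is deliberately generous: the test should only fail when the output is clearly wrong, never because of hardware variation.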

It would also be very nice if we could get the Windows installer to be created automatically as part of the continuous integration. For instance, we could set up a job that creates a new installer once a week, or on request, from the latest commit that built successfully.

On a side note, the testing system depends on the latest version of the plugin, which has not been built and uploaded yet. As far as I can tell, I cannot do that by myself at the moment, unless I modify install.py to load package information from somewhere else. So, I would like to kindly request that it be built and uploaded :).
Offline (Unknown gender) TheExDeus
Reply #6 Posted on: August 03, 2013, 04:03:05 PM

Developer
Joined: Apr 2008
Posts: 1872

I think we really need to put this into practice. A week ago Robert actually reverted the surface changes I made, and other bad stuff has happened as well. So this automatic compile-and-test, at least once a day, could be useful.
Offline (Male) Goombert
Reply #7 Posted on: August 03, 2013, 06:52:13 PM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3110

Quote
Week ago Robert actually reverted the surface changes
What surface changes? A week ago I tried to add a multisampler surface.
I think it was Leonardo da Vinci who once said something along the lines of "If you build the robots, they will make games." or something to that effect.

Offline (Unknown gender) TheExDeus
Reply #8 Posted on: August 04, 2013, 06:58:17 AM

Developer
Joined: Apr 2008
Posts: 1872

When you changed float to gs_scalar, you actually reverted the ARB change (so people on Intel cards can use it) and .png saving, among other things.
Offline (Unknown gender) forthevin
Reply #9 Posted on: August 04, 2013, 04:14:46 PM

Contributor
Joined: Jun 2012
Posts: 171

Yes, having it up and running would be very nice. While I can run the tests locally, others cannot, and they aren't run automatically or tracked by a continuous integration system. That said, once I have built the plugin from source and replaced the current version of the plugin, it will be really easy for others to run the tests: simply switch to the integration_tests branch and run the command python3 integration_tests/run_and_record_tests.py. Then open the file integration_tests/output_temporary/testreport.html and look through the tests to see what succeeded and what failed. There are currently 8 tests of varying scope and size, including tests I created for bugs I have fixed, as well as for bugs that have not yet been fixed.
Offline (Unknown gender) TheExDeus
Reply #10 Posted on: August 04, 2013, 05:49:23 PM

Developer
Joined: Apr 2008
Posts: 1872

So how exactly will the testing be done? I guess we don't commit our changes to enigma-dev without testing (as that is the whole point), so do we check out the testing branch (integration_testing_develop) and push our code there, and then, if the tests are okay, push to enigma-dev? Or do we just get the test branch, add our changes to it locally, test, and if everything is okay commit to enigma-dev without ever committing to the testing branch? I would also love for the tests to be run on enigma-dev automatically once in a while, with the results uploaded somewhere on this site.
Offline (Male) Goombert
Reply #11 Posted on: August 04, 2013, 07:29:57 PM

Developer
Location: Cappuccino, CA
Joined: Jan 2013
Posts: 3110

Quote
When you changed float to gs_scalar you actually reverted the ARB change (so people on intel cards can use it) and .png saving among other things.
That was what the merge conflict was about, now that I remember. But if I recall correctly, Harri, I went to GitHub and copied the function over from the main repository exactly as you committed it. If it is not the same, then please commit it again.

Quote
once I have built the plugin from source and replace the current version of the plugin,
Yes, I am working on getting the LGM with the classes uploaded for you, forthevin. I have been having problems with my internet for about 3 months now, and Verizon refuses to fix it. I get upwards of 40% packet loss, so uploading always times out for me; I will let you know when I have a successful attempt at uploading it.


Offline (Unknown gender) forthevin
Reply #12 Posted on: August 05, 2013, 03:25:56 AM

Contributor
Joined: Jun 2012
Posts: 171

TheExDeus: Running the tests manually is just a temporary option until we get them automated. Since it is somewhat bothersome to run the tests every time, I don't expect anyone to run them manually as well (though they would be welcome to).
In regards to the end goal, the idea is indeed to have them fully automated. One way to do that would be to have multiple development branches, like Josh previously proposed: developers commit new changes to the testing branch, the tests are run automatically for each commit and the results are reported online, and if everything looks good, the changes are then ported to the master branch (where building and testing is likewise done).
The nice thing about this is that automated building and testing is relatively easy to set up with a continuous integration system. I looked into using Jenkins for it, and it seems really easy to get Jenkins to automatically detect whenever changes have happened in a branch of a GitHub repository, and then build, run all tests, and report the results (just install the GitHub plugin and configure it in the dashboard). Some of the nice things about the automated testing in Jenkins are that it shows an overview of the tests, how many commits ago a failing test first failed, and graphs of the number of failed/passed tests from build to build. Here is the Jenkins dashboard for one of Apache's Hadoop build tasks: https://builds.apache.org/job/Hadoop-branch-1/lastCompletedBuild/testReport/.

Robert: That sucks, I hope it gets sorted out soon. I might be able to build it some other way, so don't worry about it too much.
Offline (Male) Josh @ Dreamland
Reply #13 Posted on: August 06, 2013, 11:26:40 AM

Prince of all Goldfish
Developer
Location: Pittsburgh, PA, USA
Joined: Feb 2008
Posts: 2959

I'm unsure to what extent my participation is required in automating this, as I don't know what services GitHub offers or what is otherwise available to us. We currently use a GitHub service hook to update the ticker at the top of this site (as well as the IRC bot's database). If you need a server, let me know. If I can't put it up on this one, I'll host it on one of my home computers.
Offline (Unknown gender) forthevin
Reply #14 Posted on: August 08, 2013, 06:13:11 PM

Contributor
Joined: Jun 2012
Posts: 171

I am currently fixing a couple of issues, but the integration testing branch should be ready for merging into master. I have tested that a test report can be generated on both Linux and Windows. I have also created a new version of the plugin based on the current plugin source, which contains the changes needed to run the tests. It can be found here: https://www.dropbox.com/s/hoxxljr9zfnwaxv/enigma.jar. Note that it has been built for Java 1.6, and that it can only be used together with Robert's latest LGM 1.8.

Robert: I made it work by using the LGM1.8 jar instead of the repository's source for the LGM library.

Josh: Yes, a server is needed to run Jenkins (http://jenkins-ci.org/) on. I don't currently have a server set up myself, so if you are interested in handling that aspect, it would be very nice. My (somewhat limited) experience is that once Jenkins itself has been set up, most things can be handled through its GUI, including installing and configuring plugins. It has a Git plugin and a GitHub plugin (instructions for getting it to pull automatically from GitHub when a new commit has been made can be found here: https://wiki.jenkins-ci.org/display/JENKINS/GitHub+Plugin), and it comes with a built-in system for handling test results.

In regards to setting it up with the integration tests, the tests should be run as the last step of the build (using the command python3 integration_tests/run_and_record_tests.py). To publish the test results, a post-build action must be used to retrieve them. Jenkins expects the results to be in the JUnit test result report format; the current testing system supports that format, and reports in it are automatically generated in "integration_tests/output_temporary/testreport_jenkins.xml" whenever the tests are run.
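For illustration, a report of the kind Jenkins' JUnit parser accepts can be generated with the standard library along these lines (a hedged sketch; the element and attribute choices below are a minimal JUnit-style subset, not the exact format of testreport_jenkins.xml):

```python
import xml.etree.ElementTree as ET

def write_junit_report(results, path):
    """results: iterable of (suite_name, test_name, error) tuples,
    where error is None for a passing test or a message for a failure."""
    results = list(results)
    testsuite = ET.Element("testsuite", name="integration_tests",
                           tests=str(len(results)))
    for suite_name, test_name, error in results:
        case = ET.SubElement(testsuite, "testcase",
                             classname=suite_name, name=test_name)
        if error is not None:
            # A <failure> child marks the case as failed in Jenkins.
            failure = ET.SubElement(case, "failure", message=error)
            failure.text = error
    ET.ElementTree(testsuite).write(path, encoding="utf-8",
                                    xml_declaration=True)
```

Jenkins would then be pointed at the generated file via the "Publish JUnit test result report" post-build action.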