A Distutils Regression Test System?
by Tarek Ziadé
I am making some progress in Distutils. I closed something like 10 bugs last week, and I am reaching issues that were added 8 months ago. Not that everything is entirely cleaned up in the newest issues, but they’re almost all being processed. Every commit comes with at least one test, to get the code base back into a state where it is easier to make things evolve without the risk of breaking them.
It comes through tiny little changes, with tests and an eye on the coverage.
Now I am facing an unpleasant situation: since the test coverage is still low, I am always scared of breaking something in Distutils when I am fixing a bug or making a change. Buildbots are watching, and I run some of my own packaging work with the current trunk.
But still, this is an unpleasant situation, and I don’t want to cause the package to be broken in the next Python version…
But the regression tests exist! They are there, hidden, in the community. They are everyone’s packages.
- Joe adds an issue in the Python bug tracker, because Distutils didn’t work as expected on his package because of a bug
- At some point the bug is (was) fixed.
- The test to make sure the bug is fixed is “Joe is running Distutils over his package again, and makes sure it is properly installed, compiled, etc”.
- The bug is closed.
So how can I get this test back to make sure Joe’s package is still working properly, so he doesn’t hate us at the next major Python release?
A Distutils Regression Test Server
If Joe’s package is on PyPI, we can set something up. A dedicated server that watches the PyPI changelog and triggers a buildbot when:
- a new release of Joe’s Package comes out
- we change something in Distutils code
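The two triggers above could be sketched as a small dispatch function. This is only an illustration: the `Event` class, the event kind names and the `packages_to_rebuild` helper are hypothetical, and nothing here talks to PyPI or to a buildbot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One observed change. Both kind names are hypothetical labels."""
    kind: str          # "package_release" or "distutils_commit"
    package: str = ""  # only meaningful for "package_release" events


def packages_to_rebuild(event, watched):
    """Return the watched packages whose regression test should run."""
    if event.kind == "distutils_commit":
        # A change in Distutils itself re-tests every watched package.
        return sorted(watched)
    if event.kind == "package_release" and event.package in watched:
        # A new release re-tests that package only.
        return [event.package]
    # Anything else (for example a release of an unwatched package) is ignored.
    return []


if __name__ == "__main__":
    watched = {"joes-package"}
    print(packages_to_rebuild(Event("package_release", "joes-package"), watched))
    print(packages_to_rebuild(Event("distutils_commit"), watched))
    print(packages_to_rebuild(Event("package_release", "unrelated"), watched))
```

Only packages in the watched set are ever rebuilt, which matches the idea of testing Joe’s package rather than everything on PyPI.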
The precise test to be run is still unclear to me, but I am thinking about some generic strategies and I think it’s possible. Let’s call this test a distutils regression test. (If you have a better name, I’ll buy it.)
Of course it doesn’t have to run on all the packages that are uploaded out there at PyPI. Just Joe’s, because he came up with a problem we fixed. And we would be ashamed if the bug came back on Joe’s package.
Of course this requires a server, and probably a VMware-like system to host buildbot slaves if Joe runs Windows or Solaris. It also requires that Joe uses the right metadata in his package so we know if it works under Python 2, Python 3, etc. MvL added enough classifiers lately for this.
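As a rough sketch of how the server could read that metadata, the function below extracts the Python major versions a package declares through its `Programming Language :: Python :: ...` classifiers. The function name is hypothetical, and how the classifier list is obtained (from `setup.py` or from PyPI) is left out.

```python
import re

# Common prefix of the version classifiers in the Trove list.
PREFIX = "Programming Language :: Python :: "


def supported_major_versions(classifiers):
    """Return the sorted Python major versions named in the classifiers."""
    majors = set()
    for classifier in classifiers:
        if not classifier.startswith(PREFIX):
            continue
        # Keep the leading number only, so "2.6" and "3 :: Only" both count.
        match = re.match(r"(\d+)", classifier[len(PREFIX):])
        if match:
            majors.add(int(match.group(1)))
    return sorted(majors)


if __name__ == "__main__":
    print(supported_major_versions([
        "Development Status :: 5 - Production/Stable",
        "Programming Language :: Python :: 2",
        "Programming Language :: Python :: 2.6",
        "Programming Language :: Python :: 3",
    ]))
```

A package that declares no version classifier yields an empty list, and the server would then have to fall back to some default policy.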
A Distributed Distutils Regression Test
But some packages are not on PyPI, for privacy or for the convenience of the person in charge of the packaging process. So, what if the distutils regression test were provided as a Distutils command? It could run the same test the server runs, and come up with a report that can be sent by mail to a special mailing list or so.
This supposes that the developer is cooperative. So maybe it could even be triggered automatically when any Distutils command fails, and ask the user whether he would like to send a report?
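To give an idea of what such a local run could produce, here is a minimal sketch that runs a list of commands and builds a plain-text report. It is not a Distutils command, just standard-library Python; the function names and the report format are assumptions, and sending the report by mail is left out.

```python
import subprocess
import sys


def run_regression(commands, cwd=None):
    """Run each command and collect (command, exit code, combined output)."""
    results = []
    for cmd in commands:
        proc = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
        results.append((cmd, proc.returncode, proc.stdout + proc.stderr))
    return results


def format_report(results):
    """Build a report; the output of a command is included only on failure."""
    lines = []
    for cmd, code, output in results:
        status = "OK" if code == 0 else "FAILED (exit %d)" % code
        lines.append("%s: %s" % (" ".join(cmd), status))
        if code != 0:
            lines.append(output.rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere. A real run would use
    # something like [sys.executable, "setup.py", "sdist"] with cwd set to
    # the package directory.
    demo = [
        [sys.executable, "-c", "print('build ok')"],
        [sys.executable, "-c", "import sys; sys.exit(2)"],
    ]
    print(format_report(run_regression(demo)))
```

Keeping the run and the report separate means the server and a developer’s machine could share the same report format.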
The good thing here is that it doesn’t require CPU power on the test server, and that anyone can run that test.
So what?
Well, I am just throwing an idea out here, because I am really concerned about the potential regression problems. Even if Distutils is 100% covered with tests, it’s not possible to test all combinations. In the end, the real-world environment is the only test that can be trusted in the packaging area.
I’ll throw this idea out at the Language Summit in March, and if it catches people’s interest, maybe a Google Summer of Code task could be done on that topic? I can’t implement it myself, I am already overwhelmed with Distutils maintenance 😀
Just out of curiosity, how do *you* test your packages to make sure they get installed correctly?
Hi Tarek,
It’s a very interesting idea – how is distutils going to be distributed? It could be a separate download, a plain python one that people could just put *locally*, side by side with their project and give back results. I suspect it’ll be easier for people to download and untar a package than installing/building python from source. Just a thought.
Orestis
+1 separating distutils out from python.
Would allow easier testing, and contributing to distutils by the community.
Three other reasons for separating distutils from python core…
1. Platforms change at a different rate from python.
2. Older versions of python need the changes.
3. More platforms and requirements are supported than what core-python distutils supports.
It should probably install as a different module name, so as to not break anything.
Calling it ‘distutilsdevel’ could be a good idea.
cu,
@Orestis, right. I was thinking about some kind of nightly build version with a specific installer to avoid compiling Python?
@Rene, Distutils should stay compatible with Python 2.3, so it could be installed over your existing Python (it has a version number). I think it’s just a problem of distribution and smart installation. You are right that older versions need the changes too. But Distutils should stay and evolve inside the Python trunk, so it doesn’t take the same path as setuptools (evolving on its own). Now for the contribution part, I could maintain a bzr branch and apply contributed patches to the trunk.
@Orestis & Rene: -1 on making distutils separate. That would raise the barrier of installing additional packages.
@Tarek: I like the idea of using PyPI packages for distutils regression testing. How about in addition to you selecting a number of standard build options to use, the packages themselves could signal some options they would like to get tested?
You have to be careful with doing tests on PyPI packages though. Someone malicious could upload a package to PyPI with code that tries to exploit your test machine. I would think that the machine running the tests should be a virtual machine that is returned to pristine state after building each PyPI package. The VM should be locked down so that it can not communicate outside, and you’d only check some port or something from the outside to get build logs and status. There’s much more to this than could really be communicated through a blog comment, but you get the idea…
Hey Tarek,
Thanks for working on that. This is very useful.
@Heikki: I agree with you: distutils should not be separate.
@Gael: thanks for cheering
@Heikki: right, we definitely need to define options when working with a package. I think having these options registered independently from the package itself, in some kind of configuration mapping on the test server, would be the simplest way to proceed. Now for the security aspect, an isolated VM is probably the most secure way to proceed. In any case, there’s a lot of work … but I think it’s worth it
hi,
here are some more arguments for decoupling distutils from python – but keeping it in the trunk.
1. Platforms change at a different rate from python. Years after a new python is being used is too late to push out distutils changes.
2. Older versions of python need the changes too. Otherwise users of distutils will have to monkey patch distutils as is done now… to support different versions of python.
3. More platforms and requirements are supported than what core-python distutils supports.
4. Versioning distutils becomes possible. Rather than “distutils that came with python2.5 on ubuntu 8.10”.
distutils should be decoupled from python, so it can be installed separately. Sure — keep it in the python trunk, just make it so it’s available separately too.
ps. here’s a couple of distutils bugs off the top of my head…
– passes -O3 to gcc by default (note -O3 likely breaks lots of code, it is the experimental optimisation flag)
– language “cpp” does not seem to invoke g++
– broken for mingw with each new python release.