And then every week, we (the CentOS Stream team) look at the results, and if they passed, we release the latest compose out to mirrors, push new images, etc.
I’d like to discuss a good contribution model to the t_functional test suite. Right now we merge whatever works and “feels right”. I also have some open questions:
What if a test starts failing? Do we hold the release? For how long?
Should there be a subset of blocking tests that we hold a release for, and a separate set of just informative tests we won’t hold a release for?
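To make the blocking vs. informative question concrete, here is a minimal sketch of how a gate could treat the two sets differently. The `blocking`/`info` labels and the `run_gate` function are hypothetical, made up for illustration; t_functional has no such split today.

```shell
#!/usr/bin/env bash
# Hypothetical gate. Each argument is "<kind>:<command>", where kind is
# "blocking" or "info". These labels are an assumption for illustration.
run_gate() {
    local entry kind cmd hold=0
    for entry in "$@"; do
        kind=${entry%%:*}   # text before the first colon
        cmd=${entry#*:}     # text after the first colon
        if bash -c "$cmd" >/dev/null 2>&1; then
            echo "PASS [$kind] $cmd"
        elif [ "$kind" = "blocking" ]; then
            echo "FAIL [$kind] $cmd"
            hold=1
        else
            # Informative tests are reported but never hold the release.
            echo "WARN [$kind] $cmd"
        fi
    done
    # Non-zero only when at least one blocking test failed.
    return "$hold"
}
```

With something like this, a failing informative test still shows up in the log, but only a failing blocking test changes the exit status that decides whether the compose ships.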
Also, right now the tests live on GitHub and are then synced to a second repo (Overview - centos/t_functional - CentOS Git server), which is where Jenkins sees them. This is a legacy setup. I'm told the GitHub repo was set up in the past mostly to be able to receive pull requests.
So that might be a second thing to discuss: what if we consolidated these two? Since we now have an official CentOS Stream space on GitLab, perhaps we could move it there?
Yeah, I think moving these to GitLab would make them easier to discover and to contribute to. The other thing I’d suggest is to consider putting together a post on the CentOS Blog to more widely advertise that these exist and are open for contributions.
Right now, I check this at least every morning (after the compose runs). Currently we don’t release unless it passes. If it fails we see why and fix the issue. The Release … no need to release something broken.
What we need is a way to test pull requests, wherever they come from (GitHub now, GitLab later if we move), before the PR is approved.
Right now, before approval, I run the suite and make sure everything in the change works. The problem is, while it works on a ‘normal shell’, some tests actually fail on the limited shell we seem to run CI tests on within Jenkins. This leads to things seeming to work but actually failing when run in CI.
It should be easy to build a form that takes the PR number and runs that PR on the actual CI system, so failures show up there before approval. We would also want the ability to run on every version and arch. This can be a manually triggered run that has to happen as part of the PR approval process.
There is a third talking point, which maybe deserves a separate thread: what kind of tests we put into t_functional and what should go into package-specific dist-git tests (Test Management Tool :: Fedora Docs).
I think ideally for package-specific tests in t_functional we should share the same test and run it in the package gate and in the compose gate somehow.
That would be implemented with the concept of tiers (tmt has it). However, what is a second tier for someone, not worth holding an entire compose for, may be critical for others. Not sure how that would work out.
It would be great to be able to reproduce the limited shell locally. I had two tests disabled but no way to reproduce the problem locally or at least more info to see if I could reproduce it. I think the issue about shell differences that can cause problems should be resolved or at least documented in a way contributors can reproduce locally. It will give contributors a better development experience.
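As a starting point for local reproduction, here is a rough sketch. It assumes the differences are the usual ones for a non-interactive CI shell: a stripped environment, no profile or rc files, and no terminal on stdin. Nothing in this thread confirms what the Jenkins shell actually looks like, so treat the details as guesses to adjust. The `run_limited` name is made up.

```shell
#!/usr/bin/env bash
# Run a script or command under a stripped-down bash:
#   - env -i: start from an empty environment, keeping only PATH and HOME
#   - --noprofile --norc: skip login and interactive startup files
#   - </dev/null: stdin is not a terminal
# Whether this matches the Jenkins shell is an assumption.
run_limited() {
    env -i PATH=/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOME="${HOME:-/tmp}" \
        "${BASH:-bash}" --noprofile --norc "$@" </dev/null
}
```

Usage would be something like `run_limited path/to/test.sh` (placeholder path). A test that passes in a normal shell but fails here probably depends on an inherited variable or on a TTY.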
I have also started moving some tests to the packages. I think they are valuable there as well, though not sufficient, because the compose the tmt test runs against in Zuul CI will be different from the compose the t_functional test runs against. An update to systemd breaking podman socket activation comes to mind as a situation that a tmt package-level test might not have caught in time.
As you said, ideally, we have only one place for test specifications, and just run them in both places. One way of accomplishing that is by moving t_functional to tmt, and just importing the tests from the package repos, e.g.,
tmt would also give other features, like tiers and test reporting, that the team may be interested in. As I mentioned in the Matrix channel, we may be able to automate some of that migration with code-generating scripts, and then figure out the integration with Duffy.
The t_functional suite has been around since CentOS Linux 5. It may not be the best way to do this or the contents may not be the best tests. It is just what we have had as a community for doing releases for quite a long while.
I am in no way opposed to doing something better or doing this in a better way, this is just what we have now. When this was created, there were no ‘dist-git tests’ … CentOS did not even use dist-git at the time.
I think that t_functional is valuable and important and I am glad we have it. I also like that it has simple structure.
So I am not proposing to replace it. Rather, I think we would benefit from learning how to also run some of its tests earlier in the process.
If we only run tests at the compose stage, we have few options for dealing with failures. Basically, we can only wait and not ship the compose until things are fixed.
If we run the same tests at the package-level first we would be able to catch package-specific errors there. And then we would still run it again on the compose, to verify that new failures were not introduced by a certain combination of packages.
I think that if we indeed go forward with the idea of using tmt for test management, the main criterion of success should still be the simplicity of the test structure, so that we keep the barrier to contribution low.
People should not be overwhelmed by learning too much of the test framework to add a simple bash script as an integration test.
From the standpoint of the test suite engineers, the t_functional team, the need for a test management tool is unavoidable, whether it is tmt or a custom solution written in bash. Its essential complexity is something they will be dealing with either way. Those who manage and run the tests will need test execution, reporting, unified stdout/stderr handling, unified exit handling, test selection by distro, arch, and tier, troubleshooting, test grouping, test enabling/disabling, test documentation, etc. An example of that need is the existence of runtests.sh, serial-tests.sh, skipped-tests.sh and the pull request #95.
In this aspect, I think this team can benefit from adopting tmt, which is feature-complete, tested, and has a community behind continually developing it. Converting t_functional to tmt would also bring the benefit of facilitating the importing of tons of existing tmt tests from other repos, including the package-level ones from centos-stream, or even from fedora itself.
From the test contributor development experience standpoint, tmt is still running their tests written in bash. They can use tmt to select their tests with one command and run it locally, or they can simply go to their tests folder and run their bash script directly. Even the use of beakerlib is optional in tmt, but beakerlib itself is just bash. The other work the contributor would need to do is to create a .fmf companion file to their bash script tests where the contributor would need to write documentation of what the test is about.
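To give a feel for how small the contributor-side work is, here is a sketch that scaffolds a bash test together with its .fmf companion. The `scaffold_test` helper, the directory name, and the test body are hypothetical. The sketch relies only on the tmt metadata keys `summary` and `test`, and it assumes the summary text contains nothing that needs YAML quoting.

```shell
#!/usr/bin/env bash
# Create <dir>/test.sh (a plain bash test) and <dir>/main.fmf (its metadata).
# Names and contents are illustrative, not taken from t_functional.
scaffold_test() {
    local dir=$1 summary=$2
    mkdir -p "$dir"
    # The test itself: ordinary bash, exit status 0 means pass.
    cat > "$dir/test.sh" <<'EOF'
#!/bin/bash
bash --version >/dev/null
EOF
    chmod +x "$dir/test.sh"
    # The companion file: one line of documentation plus the command to run.
    cat > "$dir/main.fmf" <<EOF
summary: $summary
test: ./test.sh
EOF
}
```

The generated test.sh can still be run directly with bash, without tmt installed, which is the point made above about keeping the contributor workflow simple.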
tmt has a decent learning curve, but once you get it, you end up liking it. That has been my experience. I use it at my job for internal CI of updates against critical packages from the official CentOS repos, as well as third-party packages and programs.
If you all would like to explore this avenue, I can set up a PoC.
From what I understand, “plans” are sort of classes of tests, and this example has three:
the old t_functional
a new generation “tmt native” tests
Random idea: What if we gave any CentOS SIG a test repo they could put tests into, even without review from the CentOS Stream team, and ran them in notify-only mode? Then we could add them properly after discussing it, but the SIG would at least have something instantly when needed. That’s just an idea I got while playing with this. Would it even be useful? I guess opening an MR against the primary repo isn’t a problem. Brainstorming!
@asamalik , I don’t see much feedback on this. Not sure if it is people just too busy or it is being discussed through another channel.
If you want, we can continue to develop this PoC into a more complete solution by defining the tiers, adding more tests from t_functional, etc. I just wanted to make sure there was interest before investing more time. Let me know.
I was out of the office last week. I will review in more detail later today.
I think that indeed we should not wait for the full migration of all tests from the current repo, and should set up a new compose testing pipeline in Jenkins right away, using a repo from the SIG GitLab space.
Regarding tiers, do I understand correctly that a tier is just an arbitrary text label, and one test can belong to only one tier?
You have set it up so that tests holds the old stuff and tests/ng holds the new format, while I would rather use tests/legacy for old and tests for new, so that the new test definitions look like the regular default ones.
If it is a simple naming thing, can it be changed, or do you see other reasons to set up the ng/ tests separately?
As I understood we would technically need a two step conversion process:
A tier is just a tag. The original need for a first-class concept of tiers was for security advisory short time testing, and the desire to give that tag an official name. That’s what I understand from here: Core — tmt documentation
Thanks for the link, I see that we have both tags and tiers.
I guess the main use of tiers for us would be to fail the run early if tier0 tests are failing.
For example, if the package is podman and we have a sanity test which runs podman --version and a more elaborate test which runs a certain meaningful scenario, we should make the first one tier0, and not trigger the second if tier0 has failed.
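The ordering described here can be sketched in plain shell. This is not how tmt implements tiers. It only illustrates the fail-early idea, and the `run_tiered` function is hypothetical.

```shell
#!/usr/bin/env bash
# Run the tier0 command first; run the tier1 command only if tier0 passed.
# Illustrative only: plain shell, not tmt's own mechanism.
run_tiered() {
    local tier0=$1 tier1=$2
    if ! bash -c "$tier0"; then
        echo "tier0 failed, skipping tier1"
        return 1
    fi
    bash -c "$tier1"
}

# Example, assuming podman is installed (the scenario script name is made up):
#   run_tiered 'podman --version' './podman-scenario.sh'
```

The return status is that of the last command that ran, so a caller cannot tell a tier0 failure from a tier1 failure by status alone; the message on stdout is what distinguishes them.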
But this seems to be overlapping with the more flexible functionality provided by plans and tags, so maybe we can just leave the tier parameter aside for now.
Yes, that works
One thing we would probably need to consider is restrictions on the usage of external tests. We cannot depend on just any random repo out there. At the beginning we should probably restrict external imports to the CentOS Stream dist-git, and then choose carefully when to expand to upstream repos.
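A restriction like that could be enforced with a simple allowlist check before any external test repo is imported. The sketch below is only an illustration: the `is_allowed_repo` function is hypothetical, and the URL prefix is my assumption about where the CentOS Stream dist-git lives.

```shell
#!/usr/bin/env bash
# Accept only repository URLs under one trusted namespace.
# ALLOWED_PREFIX is an assumption; adjust it to the real dist-git location.
ALLOWED_PREFIX="https://gitlab.com/redhat/centos-stream/"

is_allowed_repo() {
    case "$1" in
        *..*) return 1 ;;                 # reject path traversal tricks
        "$ALLOWED_PREFIX"*) return 0 ;;   # inside the trusted namespace
        *) return 1 ;;                    # everything else, forks included
    esac
}
```

A check like this would reject personal forks too, so a PoC pointing at a fork would need the fix merged into the original repo first.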
Yup. It should be limited to centos-stream repos only. The tmt test in the podman repo in centos-stream has a bug that makes it fail. I forked it and fixed it just for this PoC. That’s why it is pointing to my fork on GitLab. I haven’t had the chance to post an MR to the original podman repo for that fix, but I’ll do it shortly.