Introducing Metasource (or Metasaurus)

This is a follow-up to the topic “The Future of MDAPI”[1] that I opened about a week ago, mostly about how the tests in MDAPI unexpectedly failed due to an unannounced change[2] that broke how MDAPI works.

I wrote a workaround that fell back to the SQLite3-based metadata sourced from Kojipkgs[3] for the current Fedora Linux Rawhide branch - an approach that I was not happy with and that had problems of its own[4].

On Thursday, I pushed the workaround to PyPI[5] and GitHub[6] and deployed it to the staging environment[7], planning to push it to the production environment a week later[8].

Thankfully, my proposal to restore the creation of SQLite3-based metadata was accepted[9], but this is again a compromise and I would like things to be back to normal after two releases[10].

I do not like SQLite3, but I hate XML just as much. Moving away from SQLite3-based metadata to having only XML-based metadata felt like a step back[11]. Now that the change has been made, the support has been left behind[12].

For lookup purposes in services, SQLite3 is significantly faster than XML thanks to its built-in indexing support, faster query performance, lower memory usage and better scalability under concurrent access.
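To illustrate the indexing point, here is a minimal Python sketch that is not taken from MDAPI or Metasource. It assumes a heavily simplified stand-in for the repository XML (the real primary.xml schema is richer and namespaced), and the table, index and function names are made up for the example. The XML lookup has to walk the package entries one by one, while the SQLite lookup can use an index on the package name.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for repository metadata (not the real schema).
SAMPLE_XML = """
<metadata>
  <package><name>bash</name><version ver="5.2.26"/></package>
  <package><name>coreutils</name><version ver="9.5"/></package>
  <package><name>zsh</name><version ver="5.9"/></package>
</metadata>
"""


def find_in_xml(root, name):
    """Linear scan: visit <package> elements until the name matches."""
    for pkg in root.iter("package"):
        if pkg.findtext("name") == name:
            return pkg.find("version").get("ver")
    return None


def build_sqlite(root):
    """Load the same records into an in-memory table with an index on name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE packages (name TEXT, version TEXT)")
    conn.execute("CREATE INDEX idx_packages_name ON packages (name)")
    rows = [(p.findtext("name"), p.find("version").get("ver")) for p in root.iter("package")]
    conn.executemany("INSERT INTO packages VALUES (?, ?)", rows)
    conn.commit()
    return conn


def find_in_sqlite(conn, name):
    """Indexed lookup: SQLite searches the index instead of scanning the table."""
    row = conn.execute("SELECT version FROM packages WHERE name = ?", (name,)).fetchone()
    return row[0] if row else None


def query_plan(conn):
    """Return SQLite's own description of how it would run the lookup."""
    plan = conn.execute(
        "EXPLAIN QUERY PLAN SELECT version FROM packages WHERE name = ?", ("zsh",)
    ).fetchall()
    return " ".join(row[3] for row in plan)


root = ET.fromstring(SAMPLE_XML.strip())
conn = build_sqlite(root)
```

Calling `query_plan(conn)` shows whether the index is used for the lookup. With three packages the difference is invisible, but the scan cost grows with the number of packages while the indexed lookup grows far more slowly.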

Over the weekend, I worked on Metasource[13] (or call it Metasaurus - I do not give a damn) to introduce XML support to the REST API from the ground up instead of bending MDAPI to fit the XML needs[14].

I have deployed it here[15] for testing purposes. You have been warned - while it is written in Go and has most of the modern optimizations, it might perform worse than MDAPI[16].

Only one Fedora Linux Rawhide buildroot snapshot is available as a source on my virtual machine. You can also take the codebase for a spin locally, and test out the following resources to see for yourself.

  1. https://metasource.gridhead.net/rawhide/pkg/<packname>
  2. https://metasource.gridhead.net/rawhide/files/<packname>
  3. https://metasource.gridhead.net/rawhide/changelog/<packname>
  4. https://metasource.gridhead.net/rawhide/suggests/<packname>
  5. https://metasource.gridhead.net/rawhide/enhances/<packname>
  6. https://metasource.gridhead.net/rawhide/requires/<packname>
  7. https://metasource.gridhead.net/rawhide/provides/<packname>
  8. https://metasource.gridhead.net/rawhide/obsoletes/<packname>
  9. https://metasource.gridhead.net/rawhide/conflicts/<packname>
  10. https://metasource.gridhead.net/rawhide/recommends/<packname>
  11. https://metasource.gridhead.net/rawhide/supplements/<packname>
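The resources above can be exercised with a small client. This is a minimal Python sketch that assumes the service answers with JSON, which is not stated above; `build_url` and `fetch` are hypothetical helper names, and only the URL layout comes from the list.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://metasource.gridhead.net"

# Resource names as they appear in the list above.
RESOURCES = (
    "pkg", "files", "changelog", "suggests", "enhances", "requires",
    "provides", "obsoletes", "conflicts", "recommends", "supplements",
)


def build_url(resource, packname, branch="rawhide"):
    """Build the URL for one resource of one package on the given branch."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource}")
    # Percent-encode the package name so characters such as '+' survive.
    return f"{BASE_URL}/{branch}/{resource}/{urllib.parse.quote(packname, safe='')}"


def fetch(resource, packname, branch="rawhide", timeout=10):
    """Fetch one resource and decode it, assuming the body is JSON."""
    with urllib.request.urlopen(build_url(resource, packname, branch), timeout=timeout) as resp:
        return json.load(resp)
```

For example, `fetch("requires", "bash")` would request the `requires` resource for `bash` on Rawhide, provided the test deployment is still reachable.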

Please feel free to let me know your feedback. I plan on polishing the codebase a bit further, adding automation to import the XML-based metadata as well as compliance and quality checks, before moving it to staging.


  1. ↩︎

  2. ↩︎

  3. ↩︎

  4. ↩︎

  5. ↩︎

  6. ↩︎

  7. ↩︎

  8. ↩︎

  9. ↩︎

  10. ↩︎

  11. ↩︎

  12. ↩︎

  13. ↩︎

  14. ↩︎

  15. https://metasource.gridhead.net/ ↩︎

  16. ↩︎


Metasource looks like a useful project, but do we need to implement our own solution? As Fedora Infrastructure, we should try to add as little as possible to our already big maintenance burden. Isn’t there already something used by other distros that we could leverage instead of implementing something new?

Also, is this a replacement for MDAPI or just something that will run alongside it?

@zlopez, I intend to replace MDAPI with this service.

As mentioned above, in its current state, MDAPI makes use of the SQLite3 metadata, and with SQLite3 metadata generation being deprecated in createrepo_c, it is imperative that we come up with a solution that is native to XML metadata. That is what this provides while maintaining 1:1 parity with the REST API schema of MDAPI (and it will probably be deployed on the same hostname) for the same users.

Since using XML metadata is slower than SQLite, why write a new tool from scratch based on XML? packages.fedoraproject.org already found the solution: download the repository metadata and recreate the SQLite database locally with sqliterepo_c.
I haven’t had time to look at the code yet, but I think it’s much simpler to adjust MDAPI than to develop a new tool (which, by your own account, will run slower). Am I missing some pieces?

Hey :waving_hand:,

I (and some more folks from Fedora Infra) had questions about the maintenance of the SQLite3 database generation feature of createrepo_c, now that the command line flag has been marked as DEPRECATED. The first draft reads the XML document linearly, but as the document structure is flat, we could not quite make use of XPath to speed things up and hence had to resort to using the createrepo_c headers from within the Go code.

I am concerned about the code that I am interfacing with though. Is the related functionality expected to be dropped in the future (e.g. headers like sqlite.h etc.) or can I continue using them for a relatively longer period of time? TIA for the context.

Thanks and regards,
Akashdeep Dhar
Red Hat Community Platform Engineering
t0xic0der@fedoraproject.org
akashdeep@redhat.com

Metasource has been overhauled to use the SQLite3-supporting headers of createrepo_c[1] to convert XML metadata to SQLite3 metadata. This should help speed up the queries. Feel free to take it for a spin[2].
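The general idea of a one-time conversion can be sketched in Python with only the standard library. This is a stand-in for illustration and not how Metasource or createrepo_c do it: the XML layout is simplified, and `xml_to_sqlite` and the `packages` table are hypothetical names. The document is streamed once, each package is written to SQLite, and later lookups go through the primary key instead of re-reading the XML.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def xml_to_sqlite(xml_stream, conn):
    """Stream the XML once and store one row per <package>; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS packages (name TEXT PRIMARY KEY, summary TEXT)"
    )
    count = 0
    # iterparse yields each element when its closing tag is read, so the
    # whole document never has to be held as a fully populated tree.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag == "package":
            conn.execute(
                "INSERT OR REPLACE INTO packages VALUES (?, ?)",
                (elem.findtext("name"), elem.findtext("summary")),
            )
            elem.clear()  # drop the children that were just processed
            count += 1
    conn.commit()
    return count


# Simplified, hypothetical sample input (not the real metadata schema).
SAMPLE = io.BytesIO(
    b"<metadata>"
    b"<package><name>bash</name><summary>The GNU Bourne Again shell</summary></package>"
    b"<package><name>zsh</name><summary>Powerful interactive shell</summary></package>"
    b"</metadata>"
)

db = sqlite3.connect(":memory:")
converted = xml_to_sqlite(SAMPLE, db)
```

After the conversion, a query such as `SELECT summary FROM packages WHERE name = ?` is answered from the database alone.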


  1. ↩︎

  2. ↩︎