F40 Change Proposal: SQLAlchemy 2

Summary

The python-sqlalchemy package is upgraded to major version 2. A compatibility package, python-sqlalchemy1.4, is added to the distribution to cater for software which doesn’t yet use the new API; it can be installed side-by-side with the new version. Other packages using SQLAlchemy are identified and, if necessary, steps are taken to ensure they use the correct major version package.

Owner

Detailed Description

Major version 2 of SQLAlchemy was released in early 2023 and contains many useful new features. It breaks compatibility in many ways with version 1.4, which is the version currently available in Fedora Linux and used by about 40-50 packages. Version 1.4 allows using the new API, so programs can accommodate both major versions, but it is not yet known how many of the packages using SQLAlchemy in Fedora have been adapted accordingly; this needs to be determined. A parallel-installable compatibility package python-sqlalchemy1.4 will most likely need to be added to the distribution, and packages not yet using the new API will have to either be changed to use it or use this package for the time being.

Feedback

Benefit to Fedora

Version 2 of SQLAlchemy has new features including:

  • deep integration with PEP 484 typing practices and current capabilities, particularly within the ORM, allowing e.g. to build ORM models declaratively using Python type annotations
  • fully ORM-integrated approach to bulk INSERT that is typically an order of magnitude faster on most backends

See the upstream release announcement for details.

Scope

  • The proposal owners have to upgrade python-sqlalchemy to the new major version and add the compatibility package to the distribution. They need to find out which of the existing packages in Fedora Linux don’t work with the new version and document how they need to be packaged differently to actually use the compatibility package.
  • Other maintainers of packages using SQLAlchemy need to verify if their packages work with the new version of SQLAlchemy and, if not, do the necessary changes to:
    • make them compatible with the new API, or
    • ensure that the compatibility package is installed and also used by the code during runtime.
  • This change isn’t expected to require involvement of release engineering. It is a system-wide change only because the number of dependent packages is high enough to make it impractical to get all their maintainers on board as proposal owners.
  • Policies and guidelines: N/A (not needed for this Change)
  • Trademark approval: N/A (not needed for this Change)
  • Alignment with Community Initiatives: N/A

Upgrade/compatibility impact

Upgrades would be subject to the potentially changed dependencies between packages, but this should be straightforward and have no compatibility issues.

How To Test

  1. No special hardware or data is needed for testing.
  2. Testers would need to install packages using SQLAlchemy and use them.
  3. Installing the packages should pull in the “right” version, i.e. the default one with the new major version, or the compatibility package.
  4. Using the packages should show no problems, neither on the command line nor in logs or similar.

User Experience

Users of packages already adapted to the new APIs should benefit from the new features, including performance enhancements.

Dependencies

A quick scan(*) on Fedora 38 for packages requiring the provides of python3-sqlalchemy yields this list of potentially affected dependent RPM source packages:

  • bodhi-server
  • buildbot
  • datagrepper
  • dionaea
  • dlrn
  • eralchemy
  • ipsilon
  • keylime
  • limnoria
  • mailman3
  • mirrormanager2
  • module-build-service
  • OpenLP
  • pagure
  • pgadmin4
  • pychess
  • python-agate-sql
  • python-aiomysql
  • python-alembic
  • python-databases
  • python-datanommer-models
  • python-flask-security-too
  • python-flask-sqlalchemy
  • python-gertty
  • python-hass-data-detective
  • python-migrate
  • python-odata-query
  • python-opentelemetry-contrib
  • python-oslo-db
  • python-pybids
  • python-pykmip
  • python-repoze-who-plugins-sa
  • python-sentry-sdk
  • python-sqlacodegen
  • python-sqlalchemy-collectd
  • python-sqlalchemy-filters
  • python-sqlalchemy_schemadisplay
  • python-sqlalchemy-utils
  • python-wtforms-sqlalchemy
  • python-zope-sqlalchemy
  • resalloc
  • sigul
  • yokadi

(*) using this command:

rpm -q --provides python3-sqlalchemy | while read prov rest; do
dnf repoquery --qf '%{sourcerpm}' --whatrequires "$prov"
done | \
sed 's|-[^-]*-[^-]*$||g' | \
grep -vFx python-sqlalchemy | \
sort -u
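
The post-processing part of this pipeline (strip version and release, drop python-sqlalchemy itself, sort and de-duplicate) can be sketched in Python as follows. This is only an illustration: the function name is made up here, and the version-release strings in the sample input are invented placeholders, not real Fedora builds.

```python
import re


def source_package_names(srpm_names, exclude=("python-sqlalchemy",)):
    """Reduce source RPM names to sorted, unique source package names."""
    names = set()
    for srpm in srpm_names:
        # Drop the trailing "-<version>-<release>..." part, like the sed expression does.
        name = re.sub(r"-[^-]*-[^-]*$", "", srpm.strip())
        if name and name not in exclude:
            names.add(name)
    return sorted(names)


# Placeholder input: the version-release parts are invented for illustration.
sample = [
    "python-alembic-1.0-1.fc38.src.rpm",
    "pagure-1.0-1.fc38.src.rpm",
    "python-sqlalchemy-1.0-1.fc38.src.rpm",
    "python-alembic-1.0-1.fc38.src.rpm",
]
print(source_package_names(sample))  # ['pagure', 'python-alembic']
```

Note that Python sorts by code point, so the ordering of mixed-case names may differ from what `sort -u` prints under a given locale.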

Contingency Plan

  • Contingency mechanism: (What to do? Who will do it?) Proposal owners would have to revert to version 1.4 of SQLAlchemy. Dependent packages aren’t expected to have to revert applied changes.
  • Contingency deadline: Beta Freeze
  • Blocks release? No

Documentation

Release Notes

SQLAlchemy was upgraded to version 2 in this release of Fedora Linux with many new features and a revamped API. A compatibility package was put in place to ensure software that depends on the old API stays functional.

This also needs to be announced on devel-announce.


Thanks for working on this! It would be helpful if parts of the proposal were more concrete/specific.

I’d like to see more details on how this will be implemented. I don’t think creating parallel installable python library compat packages like this is possible without horrible hacks. Creating a compat package that declares Conflicts: python3-sqlalchemy is definitely possible. See, e.g., rpms/python-cython0.29 on src.fedoraproject.org.

How do you plan to handle this? It would be helpful to have a link to a Copr showing which packages (do not) build with the new versions.

Admin note: I’ve changed the post ownership from @zbyszek[1] to @nphilipp to reduce confusion. Posts can’t have multiple owners, so I picked Nils because: 1. he’s listed first and 2. Čestmír hasn’t logged into Fedora Discussion and so does not have an account yet.


  1. posting on behalf of fesco

Thanks. I had no idea that this is possible.

(I’m also testing the reply-by-mail feature. Let’s see how this goes.)

Apparently OK, except that it removed my quote of part of your message that I included in my reply. It would be nice if it didn’t do that.

I sent a mail now.

@nphilipp I would also include dependencies on python3-flask-sqlalchemy. Packages can depend on that and rely on indirect dependency (package → python3-flask-sqlalchemy → sqlalchemy).

Flask-Sqla was updated to a release that supports sqlalchemy 2.x, but a package can work with Flask-Sqla and sqlalchemy 1.x but fail with Flask-Sqla and sqlalchemy 2.x (I’ve seen that in our projects).

I am also concerned about the parallel-installable package plan. Please share details.

The quick scan on Fedora 38 for packages requiring the provides of python3-sqlalchemy seems to yield incomplete results.

The complicated for loop in the proposal seems to yield the same number of results as:

$ repoquery -q --repo={fedora,updates} --releasever 38 --whatrequires python3-sqlalchemy --qf '%{sourcerpm}' | pkgname | sort -u
...
43 packages here
...

However, this omits all the buildtime-only dependencies (which could be queried with repoquery -q --repo={fedora,updates}{,-source} --releasever 38 --whatrequires python3-sqlalchemy | grep src$ | pkgname | sort -u).

When I put it all together and use rawhide, I get 56 packages:

$ sort -u <(repoquery -q --repo=rawhide{,-source} --whatrequires python3-sqlalchemy | grep src$ | pkgname) <(repoquery -q --repo=rawhide --whatrequires python3-sqlalchemy --qf '%{sourcerpm}' | pkgname)
bodhi-server
buildbot
dionaea
dlrn
eralchemy
fedmsg
ipsilon
keylime
limnoria
mailman3
mirrormanager2
module-build-service
odcs
OpenLP
pagure
pgadmin4
pychess
python-agate-sql
python-alembic
python-anykeystore
python-aws-xray-sdk
python-beaker
python-dask
python-databases
python-fastapi
python-flask-security-too
python-flask-sqlalchemy
python-gertty
python-hass-data-detective
python-jsonpickle
python-kombu
python-libpysal
python-logbook
python-migrate
python-odata-query
python-opentelemetry-contrib
python-oslo-db
python-pybids
python-pykmip
python-pymssql
python-pynetdicom
python-repoze-who-plugins-sa
python-sentry-sdk
python-slackclient
python-sphinxcontrib-websupport
python-sqlacodegen
python-sqlalchemy
python-sqlalchemy-collectd
python-sqlalchemy-filters
python-sqlalchemy_schemadisplay
python-sqlalchemy-utils
python-testing.postgresql
python-wtforms-sqlalchemy
python-zope-sqlalchemy
resalloc
sigul

When the command is run on Fedora 38, it yields 6 more packages and 1 fewer:

+datagrepper
+python-datanommer-models
+python-factory-boy
+python-geopandas
+python-pandas
+yokadi
-python-slackclient
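
A comparison like this can be reproduced with plain set arithmetic. The sketch below uses only a short excerpt of the two lists above (two shared names plus some of the differing ones), and the function name is made up for illustration.

```python
def compare_package_lists(baseline, other):
    """Return (only_in_other, only_in_baseline) as sorted lists."""
    baseline_set, other_set = set(baseline), set(other)
    return sorted(other_set - baseline_set), sorted(baseline_set - other_set)


# Excerpts only, not the full query results.
rawhide = ["fedmsg", "odcs", "python-slackclient"]
fedora38 = ["fedmsg", "odcs", "datagrepper", "python-pandas", "yokadi"]

added, removed = compare_package_lists(rawhide, fedora38)
for name in added:
    print(f"+{name}")
for name in removed:
    print(f"-{name}")
```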

If python3-flask-sqlalchemy can cope with either, I think it should work if the package consuming it does the necessary (Python) package requirement dance to pin the sqlalchemy version. But they need to be included in the list, right.

Sorry for replying a week late, but I didn’t notice that the conversation has started here (maybe was a little overzealous in turning Discourse notifications down :see_no_evil:).

Edit: Looks like I wasn’t super zealous though, the “replies” list in the menu above gives me a number of review related conversations I haven’t been part of… Kind of overwhelmed by this TBH.

Thanks! The few times I need it, I always struggle a bit with querying potentially affected users.

Define “horrible hacks” :wink:.

My (boring) idea was to install 1.4 somewhere else than the sqlalchemy directory in site-packages, and packages dependent on the old version would need to add that directory to their sys.path before importing sqlalchemy. Not sure if this would need to go in a convenience module.

Hmm, I wouldn’t be super thrilled if users had to decide whether they want to install established, not-yet-adapted software, or new things which dropped compat with 1.4.

Python being Python, building packages alone tells us little. It’d probably be a mixture of checking what upstreams state and trying to run things… But now you mention it, the onus should probably be on the maintainers themselves as they know their packages better and on me to inform them that they need to do it.

fedrq whatrequires -F source python3-sqlalchemy gives the same 56 packages as that pipeline but is much simpler. fedrq whatrequires -F source python3-sqlalchemy python3-flask-sqlalchemy includes 5 more packages.

Yes, I was thinking of installing the old sqlalchemy in an alternative location and then requiring packages to modify sys.path in __init__.py or manually patch imports.

I think both of these are terrible hacks. If a package depends on sqlalchemy < 2.0.0, the dependency generators will work just fine, but then it won’t work (ImportErrors will be raised), because you changed where the package is installed. Even if you tell the maintainers of all the existing packages, this will cause a lot of confusion.

Modifying sys.path is also a problem. What happens if one package depends on the old sqlalchemy and another on the new sqlalchemy, and they are imported in the same process? Hard to debug issues. One of the packages will get the wrong sqlalchemy version depending on which is imported first.
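
The "first import wins" behaviour can be shown without SQLAlchemy at all. The sketch below is a stand-in under stated assumptions: `fakelib` is a hypothetical module created in two temporary directories, one pretending to be the relocated 1.4 copy and one the default 2.0 copy. Each "consumer" prepends its preferred directory to sys.path before importing.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_fake_install(root, dirname, version):
    """Create <root>/<dirname>/fakelib.py that only records a version string."""
    location = Path(root) / dirname
    location.mkdir()
    (location / "fakelib.py").write_text(f"__version__ = {version!r}\n")
    return str(location)


def import_fakelib_from(location):
    """Mimic a package that prepends its preferred directory, then imports."""
    sys.path.insert(0, location)
    try:
        return importlib.import_module("fakelib")
    finally:
        sys.path.remove(location)


with tempfile.TemporaryDirectory() as root:
    old_dir = make_fake_install(root, "compat", "1.4")   # hypothetical compat location
    new_dir = make_fake_install(root, "default", "2.0")  # hypothetical default location

    sys.modules.pop("fakelib", None)
    importlib.invalidate_caches()
    seen_by_first = import_fakelib_from(old_dir).__version__
    # The second consumer asks for the other directory, but the module is
    # already cached in sys.modules, so it gets the first copy again.
    seen_by_second = import_fakelib_from(new_dir).__version__

print(seen_by_first, seen_by_second)  # 1.4 1.4
```

Swapping the order of the two calls makes both consumers see 2.0 instead, which is why the failure depends on import order and is hard to debug.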

Well, I think it’s better than a hacky compat package that breaks common expectations about how Python packages are supposed to work and diverges from upstream.

Sure, Python is a dynamic language, but packages should run unit tests or at least an import smoke check that will catch basic issues. In the Python SIG, we do test rebuilds in Copr all the time when updating libraries and often catch issues in advance.
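
As a rough idea of what an import smoke check amounts to, here is a minimal sketch. The function name and the module names passed to it are placeholders; a real package would list its own top-level modules.

```python
import importlib


def import_smoke_check(module_names):
    """Try to import each module; return {name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:
            # ImportError is the typical case, but import-time code can raise anything.
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# "json" stands in for a module that imports fine; the second name is
# deliberately nonexistent to show what a failure looks like.
failures = import_smoke_check(["json", "hypothetical_missing_module_for_demo"])
for name, error in failures.items():
    print(f"{name}: {error}")
```

A check like this only proves that imports resolve. It would catch a dependent package importing a name that was removed in the new major version at import time, but not API breakage that only shows up when the code runs.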

IMHO, the Change owners should perform a basic impact check/rebuild in Copr and be available to help out package maintainers if possible. Pushing a major update without making any attempt to check other packages is not a good idea in my book.

Why not rename it to sqlalchemy14? Then all old packages can replace the
import sqlalchemy
with
import sqlalchemy14 as sqlalchemy

Imports should be fixable with sed:

sed -e 's/import sqlalchemy\([^ ]*\) as /import TMPSA14\1 as /' \
    -e 's/import sqlalchemy\([^ ]*\)/import TMPSA14\1 as TMPSA\1/' \
    -e 's/from sqlalchemy\([^ ]* import \)/from sqlalchemy14\1/' \
    -e 's/TMPSA/sqlalchemy/g'

That would avoid the issue of having to deal with issues where one
package depends on the new and one on the old version.
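
For illustration, the same rewrite can be sketched in Python. This assumes the renamed module really would be called sqlalchemy14 (that name is only the suggestion above, not an existing package), and it is line-based, so it ignores multi-line or conditional imports. Unlike the sed expression, it leaves a bare `import sqlalchemy.orm` without an alias untouched, because `import sqlalchemy14.orm as sqlalchemy.orm` would not be valid Python.

```python
import re

COMPAT_NAME = "sqlalchemy14"  # hypothetical module name from the suggestion above

_RULES = [
    # from sqlalchemy[.sub] import x  ->  from sqlalchemy14[.sub] import x
    (re.compile(r"^(\s*)from sqlalchemy((?:\.[\w.]+)?) import "),
     r"\1from " + COMPAT_NAME + r"\2 import "),
    # import sqlalchemy[.sub] as x  ->  import sqlalchemy14[.sub] as x
    (re.compile(r"^(\s*)import sqlalchemy((?:\.[\w.]+)?) as "),
     r"\1import " + COMPAT_NAME + r"\2 as "),
    # import sqlalchemy  ->  import sqlalchemy14 as sqlalchemy
    (re.compile(r"^(\s*)import sqlalchemy\s*$"),
     r"\1import " + COMPAT_NAME + " as sqlalchemy"),
]


def rewrite_imports(source):
    """Apply the first matching rule to each line and return the new source."""
    rewritten = []
    for line in source.splitlines():
        for pattern, replacement in _RULES:
            new_line, count = pattern.subn(replacement, line)
            if count:
                line = new_line
                break
        rewritten.append(line)
    return "\n".join(rewritten)


example = "\n".join([
    "import sqlalchemy",
    "import sqlalchemy.orm as sa_orm",
    "from sqlalchemy.orm import Session",
    "import sqlalchemy_utils",  # a different package, must stay unchanged
])
print(rewrite_imports(example))
```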

Renaming the package would work for apps, but not for libraries like flask-sqlalchemy which can cope with both and may be depended upon by apps that can cope only with either one or the other – these have to be able to import sqlalchemy and get the version that their consumer uses.

Moving the package out of site-packages would mean anything that uses it needs to be patched on the code level and/or the metadata level (dependencies). I would very much prefer a conflicting regular compat package (like we have for cython or mistune).

Thanks, I didn’t know that tool, very helpful!

Originally (when I filed the Change proposal), I was thinking about something along the lines of pkg_resources.require("...") before import, but AIUI this has been deprecated and I have no idea if it works properly today. It would also suffer from the same issues re: metadata and conflicting requirements between different components (unless they both used pkg_resources.require("..."), which would only make the whole affair fail early and would likely be yet another downstream maintenance burden).

You’ve convinced me, thanks! I’ll update the Change Proposal to take the comments into account, but will probably only finish after I’ve returned from vacation.
