How to automatically keep track of packages installed/removed by user using dnf

Note, I edited the topic to link it more closely to the specific issue that was discussed here. “Running a script after a command” is a much more general issue.

Do you manually update that list?

Generally, isn’t running dnf repoquery --qf "%{name}" --userinstalled simpler?

Yeh :smile:

Not in my case. I install lots of packages, often just to test them to give karma or to build software before I submit it for review, and I don’t always remember to remove them. The list I maintain is stuff that I know I use regularly, so it’s been vetted for my personal use.

Here’s the output of the repoquery command, for example. Notice how it differs from my personal curated list:

$ sudo repoquery --qf "%{name}" --userinstalled
[sudo] password for asinha:
0ad
Coin2-devel
ImageMagick
SIMVoleon-devel
adobe-release-x86_64
anka-coder-condensed-fonts
anka-coder-fonts-common
anka-coder-narrow-fonts
anka-coder-norm-fonts
aria2
backintime-qt
bat
biber
byobu
clang-devel
cmake
cowsay
cscope
dconf-editor
deja-dup
dia
docker
doxygen
dracut-live
endless-sky
enscript
expat-devel
expatpp-devel
fedora-easy-karma
fedora-packager
fedora-review
flash-plugin
fortune-mod
freeglut-devel
gimp
git-all
gitg
gnome-shell-extension-pomodoro
gnome-tweaks
gnucash
gnuplot
gstreamer1-plugins-bad-free-extras
gstreamer1-plugins-bad-nonfree
gtypist
htop
httpie
hw-probe
initscripts
inkscape
java-1.8.0-openjdk
klt-devel
langpacks-en
latexmk
lftp
libXi-devel
libXmu-devel
libdc1394-devel
libgeotiff-devel
libjpeg-turbo-devel
libpng-devel
memtest86+
minizip-compat-devel
mpd
mpich
mpich-devel
mpv
msmtp
nautilus-dropbox
ncmpcpp
ncurses-devel
neomutt
neuron
newsboat
notmuch
notmuch-mutt
notmuch-vim
offlineimap
open-sans-fonts
openjpeg2-devel
openmpi
openmpi-devel
openttd
options
packit
parcellite
pass
patchutils
pdfpc
perl-Perl-Critic
pew
podman
psutils
pwgen
python3-devel
python3-dnf-plugin-tracer
python3-elephant
python3-flake8
python3-ipython
python3-matplotlib-wx
python3-natsort
python3-neuron
python3-pandas
python3-pyopengl
python3-qrcode
python3-rstcheck
python3-unidecode
python3-websocket-client
qt5-qtwebengine-freeworld
qutebrowser
rcm
readline-devel
rply-devel
rpmfusion-free-release
rpmfusion-nonfree-release
shapelib-devel
syncthing
syslinux
task
texi2html
texlive
texlive-acronym
texlive-algorithmicx
texlive-appendix
texlive-beamerposter
texlive-beamertheme-metropolis
texlive-biblatex-nature
texlive-ccicons
texlive-chktex
texlive-chronosys
texlive-classicthesis
texlive-classicthesis-doc
texlive-datetime
texlive-draftwatermark
texlive-endfloat
texlive-epigraph
texlive-epstopdf
texlive-eulervm
texlive-floatflt
texlive-fontawesome
texlive-forest
texlive-framed
texlive-frankenstein
texlive-glossaries
texlive-glossaries-english
texlive-hfoldsty
texlive-imakeidx
texlive-import
texlive-inlinedef
texlive-iwona
texlive-jneurosci
texlive-kpfonts
texlive-lacheck
texlive-lastpage
texlive-latex
texlive-latexdiff
texlive-libertine
texlive-mathdesign
texlive-mdframed
texlive-minted
texlive-multirow
texlive-nag
texlive-newfile
texlive-noto
texlive-onlyamsmath
texlive-opensans
texlive-pgfplots
texlive-preprint
texlive-regexpatch
texlive-silence
texlive-siunitx
texlive-stix
texlive-texdoc
texlive-textgreek
texlive-tocloft
texlive-todonotes
texlive-ulem
texlive-wrapfig
texlive-xargs
texstudio
the_silver_searcher
tikzit
timew
urlscan
vifm
vim-X11
vim-enhanced
vimiv
vit
w3m
weechat
xerces-c-devel
xindy
xsel
zathura
zathura-plugins-all
zlib-devel

Now that I see how many extra packages I have, I’ll go remove them :laughing:

I see. In your case, automating it after each install would not be desirable. It might be better to have “checkpoints”, such as a baseline taken from a fresh install with your package list, or snapshots taken at backup intervals.

Then after experimenting with as many packages as you like, you could easily find out what to remove with something like:

comm -13 <(sort [/baseline/package/list]) <(dnf repoquery --qf "%{name}" --userinstalled | sort)

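The -13 flag tells comm to suppress column one (lines unique to the first file) and column three (lines common to both files), leaving only the lines unique to the second input: the packages installed since the baseline. With GNU comm you can also add --total to print a summary count for each column.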
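The comparison can be wrapped in a small function. This is a minimal sketch, assuming both lists are plain text files with one package name per line; the function name new_since_baseline and the file paths in the comments are hypothetical.

```sh
#!/bin/sh
# new_since_baseline BASELINE CURRENT
# Print package names that appear in CURRENT but not in BASELINE.
# Both files hold one package name per line, in any order.
new_since_baseline() {
    tmpdir=$(mktemp -d) || return 1
    # comm needs both inputs sorted in the same collation order.
    LC_ALL=C sort -u "$1" > "$tmpdir/baseline"
    LC_ALL=C sort -u "$2" > "$tmpdir/current"
    # -1 drops lines unique to the baseline, -3 drops lines in both.
    LC_ALL=C comm -13 "$tmpdir/baseline" "$tmpdir/current"
    rm -rf "$tmpdir"
}

# Example (hypothetical paths):
#   dnf repoquery --qf "%{name}" --userinstalled > /tmp/current.txt
#   new_since_baseline ~/baseline.txt /tmp/current.txt
```

Using temporary files instead of process substitution keeps the function usable from plain sh as well as bash.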

You could pass that list to dnf remove and … :beach_umbrella:

@fasulia, my two cents: you can add --cacheonly to your command:

dnf --cacheonly repoquery --qf "%{name}" --userinstalled

This will allow you to

  1. Force dnf not to refresh metadata even if it has expired. As you’ve discussed, you don’t need fresh metadata to know which packages were installed locally by the user.

  2. Run the command as an ordinary user (not root, and without sudo), using root’s dnf cache in read-only mode (as far as I understand).

One more way to (maybe) optimize your process a bit (again, you’ve discussed it earlier) would be to set up a cron job that runs as often as you need and updates the list somewhere locally. Then, when you run your backup, you don’t need to refresh the list; you just back up the list that’s already there.
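A rough sketch of what such a cron-driven script could look like. The function names, the output path and the schedule are all hypothetical, and the dnf invocation is copied from this thread unchanged, without being checked against other dnf versions.

```sh
#!/bin/sh
# Print user-installed package names, one per line (needs dnf).
list_user_installed() {
    dnf --cacheonly repoquery --qf "%{name}" --userinstalled
}

# update_package_list OUTFILE [COMMAND...]
# Run COMMAND (default: list_user_installed) and replace OUTFILE with
# its sorted output. If the command fails or prints nothing, OUTFILE
# is left untouched, so a bad run never wipes the previous list.
update_package_list() {
    out=$1
    shift
    [ "$#" -gt 0 ] || set -- list_user_installed
    tmp=$(mktemp "${out}.XXXXXX") || return 1
    if "$@" > "$tmp" && [ -s "$tmp" ]; then
        LC_ALL=C sort -u -o "$tmp" "$tmp" && mv "$tmp" "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example call at the end of a script (hypothetical path):
#   update_package_list "$HOME/.local/share/pkglist.txt"
# Example crontab entry, once an hour (hypothetical script location):
#   0 * * * * $HOME/bin/update-package-list.sh
```

Writing to a temporary file first and renaming it afterwards means a backup that runs at the same moment sees either the old list or the new one, never a half-written file.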

Nice! Updated the script.

And I just learned something from the documentation:

-C, --cacheonly

Run entirely from system cache, don’t update the cache and use it even in case it is expired.

DNF uses a separate cache for each user under which it executes. The cache for the root user is called the system cache. This switch allows a regular user read-only access to the system cache, which usually is more fresh than the user’s and thus he does not have to wait for metadata sync.

Each user has their own cache… this explains why, even immediately after a dnf operation that required root and refreshed the metadata, running dnf as a non-root user often still requires refreshing the cache. Confusing, but now it makes sense!

I also noticed in the logs how often dnf-makecache.service runs and wondered why dnf still wants to update the metadata cache. Now I know. dnf -C [...] can benefit from the system cache and save time.
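One way to make use of this for read-only queries is a tiny wrapper that adds --cacheonly only when the caller is not root. This is just a sketch; the function names cache_flag and dnf_query are made up for illustration.

```sh
#!/bin/sh
# cache_flag UID
# Print "--cacheonly" for a non-root UID, nothing for root (UID 0).
cache_flag() {
    if [ "$1" -ne 0 ]; then
        printf '%s\n' --cacheonly
    fi
}

# dnf_query ARGS...
# Run a read-only dnf query, reusing root's system cache when the
# caller is an ordinary user.
dnf_query() {
    # $(cache_flag ...) is left unquoted on purpose: it expands to
    # either one word or nothing at all.
    dnf $(cache_flag "$(id -u)") "$@"
}

# Example:
#   dnf_query repoquery --qf "%{name}" --userinstalled
```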

Yep, it’s quite obvious if you try not to keep a user cache, as I do. Usually, if I slip and run some dnf command as my ordinary user, I run dnf clean all afterwards to remove the user’s cache (I don’t need two copies of the cache on my system, and I can do almost nothing useful with a user-only cache).

There are actually configuration parameters for this, and they’re quite easy to find (though I don’t remember them off the top of my head). If I remember correctly, it’s set to refresh the cache every 4 hours, but I could be completely wrong. I’ve seen it, but had no reason to change it from the default.

And --cacheonly is a very useful switch in some situations, yep.

--refresh does the same for the opposite cases, as you’ve mentioned above.

One more bit from the documentation regarding metadata sync:

METADATA SYNCHRONIZATION

Correct operation of DNF depends on having access to up-to-date data from all enabled repositories but contacting remote mirrors on every operation considerably slows it down and costs bandwidth for both the client and the repository provider. The metadata_expire (see dnf.conf(5)) repository configuration option is used by DNF to determine whether a particular local copy of repository data is due to be re-synced. It is crucial that the repository providers set the option well, namely to a value where it is guaranteed that if particular metadata was available in time T on the server, then all packages it references will still be available for download from the server in time T + metadata_expire.

To further reduce the bandwidth load, some of the commands where having up-to-date metadata is not critical (e.g. the list command) do not look at whether a repository is expired and whenever any version of it is locally available to the user’s account, it will be used. For non-root use, see also the --cacheonly switch. Note that in all situations the user can force synchronization of all enabled repositories with the --refresh switch.
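To check whether metadata_expire has been overridden locally, something like the following could be used. It is a sketch only: conf_value is a hypothetical helper, and it ignores section headers, so it suits the [main]-only /etc/dnf/dnf.conf and does not consider per-repository overrides under /etc/yum.repos.d.

```sh
#!/bin/sh
# conf_value KEY [FILE]
# Print the value of the last "KEY = value" line in a dnf.conf-style
# file (default: /etc/dnf/dnf.conf). Prints nothing if KEY is not set,
# in which case dnf falls back to its built-in default.
conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*#/ { next }    # skip comment lines
        NF < 2     { next }    # skip lines without "=" (e.g. [main])
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                found = v
            }
        }
        END { if (found != "") print found }
    ' "${2:-/etc/dnf/dnf.conf}"
}

# Example:
#   conf_value metadata_expire
```

If nothing is printed, the option is not set in that file; dnf.conf(5) documents the default and the accepted values.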
