How to pull activity data from Pagure

Hi, I’m looking to pull activity data from the Quick Docs repo on Pagure.

  • Data fields: merged PRs, commits made, closed issues, unique count of authors / active authors, number of lines changed
  • Time data: when PR was merged, when issue was closed, when commit was made

Purpose: I would like to highlight the past 12 months of activity at the F38 release party.

Could you point me to the data source (datagrepper?) and the filters I need to use? Even better if there is a pre-built dashboard like Cauldron.
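For anyone landing here with the same question, below is a minimal sketch of how the PR and issue part could be pulled straight from Pagure's REST API and counted per month. It is not a confirmed recipe: the endpoint paths, the `status` and `per_page` parameters, the `pagination.next` field, and the `closed_at` / `user.name` fields are written from my reading of the Pagure API docs and should be checked against the live instance. `REPO` is a placeholder, and `list_url`, `fetch_all`, and `summarise` are names made up for this example. Commits and lines changed are not covered here; those are probably easier to get from a local clone with `git log --numstat`.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://pagure.io/api/0"
REPO = "<namespace>/<repo>"  # placeholder: fill in the repo path yourself


def list_url(repo, kind, status):
    """Build a list URL. kind is 'pull-requests' or 'issues' (assumed paths)."""
    query = urlencode({"status": status, "per_page": 100})
    return f"{API_ROOT}/{repo}/{kind}?{query}"


def fetch_all(url, key):
    """Collect every page of results by following pagination.next.

    key is the JSON field holding the list: assumed to be 'requests'
    for pull requests and 'issues' for issues. Needs network access.
    """
    items = []
    while url:
        with urlopen(url) as resp:
            page = json.load(resp)
        items.extend(page.get(key, []))
        url = (page.get("pagination") or {}).get("next")
    return items


def summarise(items, start, end):
    """Count items closed in [start, end) per month, plus distinct authors."""
    per_month = Counter()
    authors = set()
    for item in items:
        closed_at = item.get("closed_at")
        if not closed_at:
            continue  # still open, or no close timestamp recorded
        closed = datetime.fromtimestamp(int(closed_at), tz=timezone.utc)
        if start <= closed < end:
            per_month[closed.strftime("%Y-%m")] += 1
            authors.add(item["user"]["name"])
    return {
        "total": sum(per_month.values()),
        "per_month": dict(per_month),
        "authors": sorted(authors),
    }
```

Usage would be roughly `summarise(fetch_all(list_url(REPO, "pull-requests", "Merged"), "requests"), start, end)` with timezone-aware `start` and `end` datetimes, and the same again with `"issues"`, `"Closed"`, and the `"issues"` key. The counting is kept separate from the fetching so it can be checked offline against a saved JSON dump.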

Thank you!

Repo URL


Hi,

I’m using this script to generate weekly statistics for our team, so it could be helpful with regard to the issues.
I also have this for a better overview of the PRs and issues made by our team.

Hopefully this will help a little.

Here is a Cauldron report looking at commits made to the Quick Docs repo:

Fedora Quick Docs stats year over year.

This does not include rich data from Pagure’s API about pull requests and issues though.

EDIT: One thing I thought was neat is that Saturdays are big days for Quick Docs commits!

Thank you both, @zlopez and Justin. Really helpful insight, and thanks for the CHAOSS report too.

Hello Justin, I forked the report and added other docs repos like Server/CoreOS/Silverblue.

A fork is here.

Below is the list of docs URLs, by repo, added to the Cauldron report ‘Fedora Docs Repo’.

Reporting period: 2022-06-01 - 2023-06-01
[Chart: bokeh_plot(1)]

The CentOS installation guide is listed as ‘other’.

I can’t add Cloud because it is hosted on readthedocs.


Nice chart here! I was looking at the CHAOSS metrics tab on your fork. It appears that contributions made on weekdays are more common in the other repositories you added to the forked report, compared to only the Quick Docs repo. I did not look too closely though.

It also makes me wonder about the spread of drive-through & repeat contributors. Are the repeat contributors only appearing on some repositories but not others, or is it a more even spread? Which repositories receive the most drive-through contributors?

The answers to these questions would be useful to know whether there are repositories that do a good job of converting drive-through contributors to repeat contributors, and/or whether there are repositories where people often get stuck.

The CHAOSS metric is really intriguing. Looking at the query parameters and time range, though, it is unclear whether the drive-through and repeat contributor counts reflect sustained contributions.

Definition

Drive-through contributors are contributors who make less than the required 5 contributions in the specified time period.
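To make the definition concrete, here is a small sketch of how that split could be reproduced from a plain list of contribution authors (one entry per commit, PR, or issue within the chosen period). The threshold of 5 comes from the definition above. Treating everyone at or above the threshold as a repeat contributor is my assumption; the definition only covers the drive-through side. `classify_contributors` is a made-up name for this example, not something Cauldron exposes.

```python
from collections import Counter

DRIVE_THROUGH_LIMIT = 5  # threshold taken from the definition quoted above


def classify_contributors(contribution_authors, limit=DRIVE_THROUGH_LIMIT):
    """Split authors into (drive_through, repeat) for one time period.

    contribution_authors holds one author name per contribution.
    Authors with fewer than `limit` contributions count as drive-through;
    the rest are treated as repeat contributors (an assumption).
    """
    counts = Counter(contribution_authors)
    drive_through = sorted(a for a, n in counts.items() if n < limit)
    repeat = sorted(a for a, n in counts.items() if n >= limit)
    return drive_through, repeat
```

Running this once per repository over the same period would show whether the repeat contributors cluster in a few repositories. Because the count resets with each period, a long-standing contributor who had a quiet period still lands in the drive-through group, which is part of why the numbers are hard to read as sustained contribution.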

Authors retained ratio and active authors are other interesting metrics.

A practical way to collect feedback is to run a poll during and after the Docs presentation on 3 June and engage with the Q&A at the same time.