OK, I figured it out I think. There is actually no Noto Sans CJK in the default fallback order! This is what I get:
“Noto Sans”(s) “Noto Sans”(w) “Noto Sans”(w) “Noto Sans”(w) “DejaVu Sans”(w) “Verdana”(w) “Arial”(w) “Albany AMT”(w) “Luxi Sans”(w) “Nimbus Sans L”(w) “Nimbus Sans”(w) “Nimbus Sans”(w) “Helvetica”(w) “Nimbus Sans”(w) “Lucida Sans Unicode”(w) “BPG Glaho International”(w) “Tahoma”(w) “Open Sans”(w)
“Comfortaa”(w) “URW Gothic”(w) “Nimbus Sans”(w) “Nimbus Sans Narrow”(w) “Carlito”(w) “Noto Sans Math”(w) “Mingzat”(w) “Padauk”(w) “Nuosu SIL”(w) “Droid Arabic Kufi”(w) “Droid Sans Armenian”(w) “Droid Sans Devanagari”(w) “Droid Sans Ethiopic”(w) “Droid Sans Fallback”(w) “Droid Sans Georgian”(w) “Droid Sans Hebrew”(w)
“Droid Sans Japanese”(w) “Droid Sans Tamil”(w) “Droid Sans Thai”(w) “Nachlieli”(w) “Lucida Sans Unicode”(w) “Yudit Unicode”(w) “Kerkis”(w) “ArmNet Helvetica”(w) “Artsounk”(w) “BPG UTF8 M”(w) “Waree”(w) “Loma”(w) “Garuda”(w) “Umpush”(w) “Saysettha Unicode”(w) “JG Lao Old Arial”(w) “GF Zemen Unicode”(w) “Pigiarniq”(w)
“B Davat”(w) “B Compset”(w) “Kacst-Qr”(w) “Urdu Nastaliq Unicode”(w) “Raghindi”(w) “Mukti Narrow”(w) “malayalam”(w) “Sampige”(w) “padmaa”(w) “Hapax Berbère”(w) “MS Gothic”(w) “UmePlus P Gothic”(w) “Microsoft YaHei”(w) “Microsoft JhengHei”(w) “WenQuanYi Zen Hei”(w) “WenQuanYi Bitmap Song”(w) “AR PL ShanHeiSun Uni”(w)
“AR PL New Sung”(w) “MgOpen Modata”(w) “VL Gothic”(w) “IPAMonaGothic”(w) “IPAGothic”(w) “Sazanami Gothic”(w) “Kochi Gothic”(w) “AR PL KaitiM GB”(w) “AR PL KaitiM Big5”(w) “AR PL ShanHeiSun Uni”(w) “AR PL SungtiL GB”(w) “AR PL Mingti2L Big5”(w) “MS ゴシック”(w) “ZYSong18030”(w) “TSCu_Paranar”(w) “NanumGothic”(w)
“UnDotum”(w) “Baekmuk Dotum”(w) “Baekmuk Gulim”(w) “KacstQura”(w) “Lohit Bengali”(w) “Lohit Gujarati”(w) “Lohit Hindi”(w) “Lohit Marathi”(w) “Lohit Maithili”(w) “Lohit Kashmiri”(w) “Lohit Konkani”(w) “Lohit Nepali”(w) “Lohit Sindhi”(w) “Lohit Punjabi”(w) “Lohit Tamil”(w) “Meera”(w) “Lohit Malayalam”(w)
“Lohit Kannada”(w) “Lohit Telugu”(w) “Lohit Oriya”(w) “LKLUG”(w) “FreeSans”(w) “Arial Unicode MS”(w) “Arial Unicode”(w) “Code2000”(w) “Code2001”(w) “sans-serif”(w) “Roya”(w) “Koodak”(w) “Terafik”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “ITC Avant Garde Gothic”(w) “URW Gothic”(w)
“sans-serif”(w) “sans-serif”(w) “Helvetica”(w) “Helvetica Narrow”(w) “Nimbus Sans Narrow”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w) “sans-serif”(w)
However, fontconfig and Qt have different rules for picking a font. Qt just goes through the fallback font list, and finds the first font that has the needed character (I think). That is Droid Sans Fallback (!), which happens to use Chinese glyphs.
Fontconfig uses a more complicated matching system. First it filters out all fonts that don’t have the required characters, then it looks at language selection before even looking at the fallback font list. “Droid Sans Fallback” has lang: ja|zh-tw(w), which does not match en, so it gets filtered out here. That leaves no fonts with CJK characters in the fallback font list, so fontconfig just picks the first of the remaining installed fonts that has them, which happens by chance to be Noto Sans CJK JP.
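To make the difference concrete, here is a toy Python model of the two strategies as I described them above. It is only a sketch of my understanding, not the real Qt or fontconfig algorithms: the `Font` class, the tiny font lists, and both function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    family: str
    langs: frozenset  # languages the font claims to support (illustrative values)
    has_han: bool     # whether it contains the Han character we need


# Hypothetical, heavily shortened stand-ins for the real lists.
FALLBACK_LIST = [
    Font("Noto Sans", frozenset({"en"}), False),
    Font("DejaVu Sans", frozenset({"en"}), False),
    Font("Droid Sans Fallback", frozenset({"ja", "zh-tw"}), True),
]
OTHER_INSTALLED = [
    Font("Noto Sans CJK JP", frozenset({"ja"}), True),
    Font("Noto Sans CJK SC", frozenset({"zh-cn"}), True),
]


def qt_style_pick(fallback):
    """Walk the fallback list and take the first font that has the glyph."""
    for font in fallback:
        if font.has_han:
            return font.family
    return None


def fontconfig_style_pick(fallback, other_installed, requested_langs):
    """Drop fonts without the glyph, then drop fallback-list fonts whose
    language does not match; if nothing is left, take the first remaining
    installed font that has the glyph."""
    wanted = set(requested_langs)
    candidates = [f for f in fallback if f.has_han and f.langs & wanted]
    if candidates:
        return candidates[0].family
    leftovers = [f for f in other_installed if f.has_han]
    return leftovers[0].family if leftovers else None


print(qt_style_pick(FALLBACK_LIST))
print(fontconfig_style_pick(FALLBACK_LIST, OTHER_INSTALLED, ["en"]))
```

With the language set to en, the first function lands on Droid Sans Fallback and the second on whichever CJK font happens to come first among the leftovers, which is the mismatch I was seeing.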
If your language is set to ja though, the configs do put Noto Sans CJK JP into the list. So that leads to the answer! I think the correct way to set Han glyph preference is like this:
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "urn:fontconfig:fonts.dtd">
<fontconfig>
  <!-- Append Japanese to the requested languages of every pattern. -->
  <match target="pattern">
    <edit name="lang" mode="append"><string>ja</string></edit>
  </match>
</fontconfig>
That basically says that, after your system or text language, fontconfig should prefer fonts for the Japanese language. Dropping that into .fonts.conf.d/05-language-fallback.conf makes KDE do the right thing, without having to hard-code font fallback order or specific fonts! It’s basically the same thing Android does. The difference is that, as far as I can tell, Linux has no way to set a list of languages and have it propagate to fontconfig across all frameworks and everything else, so you have to do it directly in fontconfig.
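If you wanted to generate that file from a list of languages, something like the Python sketch below might work. It only uses the standard library; the function name `build_lang_fallback_conf` is made up. Putting several `<string>` values into one `<edit>` is my assumption about how to extend the single-language config above, and I have not tested whether fontconfig treats their order as a preference order.

```python
import xml.etree.ElementTree as ET

HEADER = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE fontconfig SYSTEM "urn:fontconfig:fonts.dtd">\n'
)


def build_lang_fallback_conf(langs):
    """Return fontconfig XML that appends each language in `langs`
    to the lang property of every pattern (hypothetical helper)."""
    root = ET.Element("fontconfig")
    match = ET.SubElement(root, "match", target="pattern")
    edit = ET.SubElement(match, "edit", name="lang", mode="append")
    for lang in langs:
        ET.SubElement(edit, "string").text = lang
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
        ET.indent(root)
    return HEADER + ET.tostring(root, encoding="unicode") + "\n"


# Example: prefer Japanese glyphs first, then Korean (assumed ordering).
print(build_lang_fallback_conf(["ja", "ko"]))
```

The output would still have to be saved as a config file like the one above by hand; the script does not touch any fontconfig directories.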
I guess KDE maybe should do this with the existing language list in system settings? It feels like it would be a useful addition…