Code block highlight is often wrong/unnecessary

The common uses of code blocks here don’t work well with Discourse’s language detection and syntax highlighting.

  • The language detection performs poorly on short samples like shell one-liners.
  • For output only, logs, or other plaintext, highlighting is always wrong. It might look nice by coincidence, but it’s essentially random highlighting of numbers or symbols.
  • For commands + output, even if it is detected or specified as bash, the highlighting is meant for bash scripts, so the output is wrongly highlighted.

I suggest disabling this feature. Users can specify the language to get correct highlighting when needed (rarely, since this is not a programming forum[1]).

Disabling it might also slightly speed up page loading, since the detection and highlighting are done in client-side JS (?).


Examples

None of the examples have a language specified after the triple backticks.

  • This is detected as CSS:

    dnf list --installed kernel
    
  • Lua:

    $ mount --all
    
  • Ruby:

    $ tar xvf foo.tar.gz
    
  • Some common shell built-ins are detected as bash:

    echo foo
    
  • But other shell built-ins aren’t (CSS again):

    time foo
    
  • Journal output, detected as Apache:

    Apr 24 16:07:40 asuja kernel: last_pfn = 0x26f000 max_arch_pfn = 0x400000000
    Apr 24 16:07:40 asuja kernel: x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT  
    Apr 24 16:07:40 asuja kernel: last_pfn = 0x8f800 max_arch_pfn = 0x400000000
    
  • dnf history, detected as SQL:

    $ dnf history info last
    Transaction ID : 757
    Begin time     : Mon 24 Apr 2023 14:31:26 +08
    Begin rpmdb    : da3a6d93b2dc2d1044bd471a85ed5d2554dc4f59bc01b64a79d8b040d12413a3
    End time       : Mon 24 Apr 2023 14:31:29 +08 (3 seconds)
    

  1. Even on language-specific forums, it doesn’t work well. For example, in this thread on Python Discourse, the first post is detected as CSS, and the second post quotes a section of the first post but is detected as C++.

Hmm. You make a convincing argument. The poor auto-detection is probably something to bring upstream (to the library used), but in the meantime I’ll disable it.

I don’t fault it for not working well on one-liners; it’s a guessing game at best. Highlighting mixed code + output is also probably out of scope for highlight.js.

Makes sense. An option for “don’t guess on one-liners, only guess if confidence is high for multi-line” might be nice.
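To make that idea concrete, here is a rough sketch in Python. It is not an existing Discourse or highlight.js option. The function name and both thresholds are made up for illustration, and `relevance` stands for whatever confidence score the detector reports (highlight.js's auto-detection returns a relevance number along with the guessed language).

```python
def should_autodetect(code: str, relevance: int,
                      min_lines: int = 2, min_relevance: int = 10) -> bool:
    """Decide whether a guessed language should be applied to a code block.

    Hypothetical helper: min_lines and min_relevance are illustrative
    thresholds, not values taken from Discourse or highlight.js.
    """
    # Count non-blank lines so trailing blank lines do not inflate the total.
    line_count = sum(1 for line in code.splitlines() if line.strip())

    # One-liners: never guess, leave the block as plain text.
    if line_count < min_lines:
        return False

    # Multi-line blocks: only accept the guess when the detector is confident.
    return relevance >= min_relevance
```

With these defaults, `echo foo` is left unhighlighted no matter how confident the detector is, and a multi-line block is highlighted only if its score reaches the threshold.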

Just wondering if there was a change. New posts still have auto-detection on code blocks.

Test:

if auto-detect is on, this will be highlighted