clojure-mode locks up when font-locking strings with many escaped characters #589

Closed
@plexus

Steps to reproduce the problem

Run cider-pprint-eval-last-sexp on a form that returns a large string (mine was about 140k characters) containing many escaped characters.

This completely freezes my Emacs. The only way out is killall -SIGUSR2 emacs, which drops into the debugger and gives a stack trace of where Emacs was at the time:

Debugger entered--entering a function:
* #f(compiled-function () #<bytecode 0x22a293b5c0344f>)()
  clojure-string-start(t)
  clojure--font-locked-as-string-p()
  clojure-font-lock-escaped-chars(149240)
  font-lock-fontify-keywords-region(1 149240 nil)
  font-lock-default-fontify-region(57 557 nil)
  apply(font-lock-default-fontify-region 57 557 nil)
  #f(compiled-function (beg end &rest rest) #<bytecode 0x1cea7771e9712d09>)(57 557 nil)
  font-lock-fontify-region(57 557)
  #f(compiled-function (fun) #<bytecode -0x154a37361c2bbd80>)(font-lock-fontify-region)
  run-hook-wrapped(#f(compiled-function (fun) #<bytecode -0x154a37361c2bbd80>) font-lock-fontify-region)
  jit-lock--run-functions(57 557)
  jit-lock-fontify-now(57 557)
  jit-lock-function(57)
  redisplay_internal\ \(C\ function\)()

Looking at the top few functions, it seems that clojure-font-lock-escaped-chars walks through the string:

(defun clojure-font-lock-escaped-chars ...
  ...
  (while (and (not found)
              (re-search-forward "\\\\." bound t))
    (setq found (clojure--font-locked-as-string-p)))

For each backslash it calls clojure--font-locked-as-string-p, which calls clojure-string-start (twice if it's not a regex), which in turn does a backward regex search:

(defun clojure-string-start ...
  ...
        ;; Find a quote that appears immediately after whitespace,
        ;; beginning of line, hash, or an open paren, brace, or bracket
        (re-search-backward "\\(\\s-\\|^\\|#\\|(\\|\\[\\|{\\)\\(\"\\)")

I'm not familiar enough with this code or with font-locking to judge exactly what it's doing, but it does look rather inefficient. Maybe clojure-font-lock-escaped-chars could precompute the end of the string, instead of checking the font-locking to see whether it has iterated past the end of the string? Or does clojure--font-locked-as-string-p really need the clojure-string-start check? It seems to check both that the text is font-locked as a string AND that point is inside a string, which seems a bit redundant.
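
To make the suggestion a bit more concrete, here is a rough sketch of one possible direction: asking Emacs's built-in syntax parser (syntax-ppss) whether a backslash sits inside a string, rather than running a backward regex search per escape. The names my/in-string-p and my/search-escaped-char-in-string are hypothetical and not part of clojure-mode. The sketch does not distinguish regex literals from plain strings, and I have not checked whether this fits clojure-mode's actual font-lock requirements.

```emacs-lisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper, not part of clojure-mode.
(defun my/in-string-p ()
  "Return non-nil if point is inside a string, per the syntax parser."
  ;; Element 3 of the parser state is non-nil inside a string.
  (nth 3 (syntax-ppss)))

;; Hypothetical search function, loosely modelled on the loop quoted above.
(defun my/search-escaped-char-in-string (bound)
  "Move past the next backslash escape that lies inside a string.
Search no further than BOUND.  Return point on success, nil otherwise."
  (let ((found nil))
    (while (and (not found)
                (re-search-forward "\\\\." bound t))
      ;; Ask the parser about the position of the backslash itself.
      ;; `save-match-data' is defensive, in case parsing touches match data.
      (setq found (save-match-data
                    (save-excursion
                      (goto-char (match-beginning 0))
                      (my/in-string-p)))))
    (and found (point))))
```

syntax-ppss caches parser state per buffer, so repeated queries while moving forward through one long string should usually avoid rescanning from the start of the string each time. Whether that is enough to fix the freeze reported here is an assumption that would need profiling.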

Environment & Version information

clojure-mode version

Include here the version string displayed by M-x clojure-mode-display-version. Here's an example:

;; Version: 5.13.0-snapshot

Emacs version

GNU Emacs 28.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.20, cairo version 1.16.0) of 2021-02-19

Operating system

Description:    Ubuntu 20.04.2 LTS
