Keywords Not Fontified at Startup with `org-in-block-p` in the Matcher

Hi, everyone!

I have been trying to add some custom keywords to the search-based
fontification of org-mode, to add some colors to latex fragments, and
I am running into an issue I cannot seem to resolve. I was hoping that
someone might be able to offer some guidance, any insight or suggestion
is greatly appreciated! :slight_smile:

SHORT DESCRIPTION

The Goal. I have a custom set of keywords for applying faces to
certain tokens in org latex fragments, and I wish to enforce different
faces according as the fragments are within or outside certain blocks
which I consider “mathematical”.

The Current Problem. In the matcher of the keyword I have added a check
using org-in-block-p, and with that check in place the fontification does
not work when an org buffer is first opened. Some characteristics of this issue:

  1. As soon as I remove the wrapper around org-in-block-p,
    the rest of the matcher and in fact the entire fontification
    works as intended (apart from the distinct context provided
    by the block); in fact, having removed the org-in-block-p and
    added the keywords to latex-mode, the effect is also as intended.
  2. The predicate org-in-block-p and my wrappers around it also work
    when I invoke them interactively in isolation;
  3. With the org-in-block-p predicate added, however, on startup,
    whether my org file is folded or expanded, there is no fontification
    at all for any latex fragments, regardless of whether they are inside
    one of my list of blocks or outside;
  4. Any subsequent edits after startup are fontified as intended. In particular,
    when I modify any text on the same line or in the same block with an existing
    fragment even trivially (say by inserting a white space), all the fragments on
    the line or in the block get updated fontification, as desired.

These points combined suggest to me that somehow, the interaction between
the predicate org-in-block-p and the font-lock engine is messed up
precisely on the startup of a file.

DETAILS

The Value of the Keywords

In case this is relevant, the list of keywords I wish to add are stored
in a variable called org-latex-fragment-inblock-kws, which after evaluation
in IELM prints:

(((lambda
    (limit)
    (when
	(org-inside-math-block-p)
      (if
	  (save-excursion
	    (and
	     (re-search-forward "\\\\(" limit t)
	     (re-search-forward "\\\\)" limit t)))
	  (progn
	    (re-search-forward "\\\\)" limit t)
	    (re-search-backward "\\\\(" nil t)
	    (re-search-forward "\\\\(" nil t))
	nil)))
  (0 'latex-math-delim-face prepend)
  ("\\(\\\\\\)\\([[:alpha:]]+\\)"
   (save-excursion
     (re-search-forward "\\\\)")
     (re-search-backward "\\\\)")
     (point))
   (progn
     (re-search-backward "\\\\(")
     (re-search-forward "\\\\("))
   (2 'latex-math-cmd-face prepend))
  ...
  ... ;; And some more of these anchored highlighters.
  ... ;; Then finally at the end the highlighter for 
  ... ;; the closing delimiter.
  ...
  ("\\\\)"
   (save-excursion
     (re-search-forward "\\\\)")
     (point))
   nil
   (0 'latex-math-delim-face prepend))))

Here, the whole lambda chunk is the matcher constructed, and within it
the function org-inside-math-block-p is just a simple wrapper around
org-in-block-p which is defined as follows:

(defun org-inside-math-block-p () 
  (org-in-block-p org-math-block-list))

Where org-math-block-list is just a list of strings.

Some Rudimentary Debugging, Hopefully Helpful

I have tried to add a few messages around org-inside-math-block-p for
some minor debugging, including what-line, which informs me that at
startup of an org file, the line the point has been at is always 1.

With the same rudimentary messaging behavior wrapped up interactively
around org-inside-math-block-p, the line number and the detected org
block seem to be correct after startup.

My Own Naive Speculation

Is this related to how org blocks behave at buffer startup, or how
initial fontification is carried out?

Even if the particular behavior of having latex fragments inside
and outside certain blocks fontified differently (and correctly
at startup) cannot be achieved, I should really like to know why,
and possibly learn one thing or two about the font-lock engine
or the internals of org.

There is an error in the logic of the overarching matcher which I should mention. It should have been

(lambda
    (limit)
    (when
	(org-inside-math-block-p)
      (if
	  (save-excursion
	    (and
	     (re-search-forward "\\\\(" limit t)
	     (re-search-forward "\\\\)" limit t)))
	  (progn
            (re-search-forward "\\\\(" limit t)
	    (re-search-forward "\\\\)" limit t)
	    (re-search-backward "\\\\(" nil t)
	    (re-search-forward "\\\\(" nil t))
	nil)))

This prevents the case where it finds the right delimiter even when there is no left delimiter between the cursor position and the right delimiter.

But this alone does not resolve the startup malfunction of the org-in-block-p predicate; the real problem is still there after updating my config this way.
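
One possible reading of the symptoms, offered as a sketch rather than a confirmed diagnosis: font-lock calls a function matcher with point at the start of the region it is fontifying, and expects the function to search forward from there up to the limit. A predicate tested once on entry therefore only describes that starting position (line 1 for the first chunk of a freshly opened buffer), and when it returns nil the keyword is skipped for the whole region. After an edit, the refontified region starts near the change, which would explain why fragments in the same block then light up. A matcher that searches first and tests the predicate at each candidate could look like the following. The function name is hypothetical, and the default value of org-math-block-list is only a stand-in for the list of strings mentioned above.

```elisp
(require 'org)

;; Stand-in for the poster's list of "mathematical" block names (assumed values).
(defvar org-math-block-list '("Proof" "THM" "DFN" "LEM"))

;; Hypothetical name; not part of org or of the original config.
(defun my/org-math-open-delim-matcher (limit)
  "Move past the next \\( before LIMIT that sits inside a math block.
Return non-nil and leave the match data on that delimiter, or
return nil when no such delimiter exists before LIMIT."
  (let (found)
    (while (and (not found)
                (re-search-forward "\\\\(" limit t))
      ;; Test the block predicate at the candidate match, not at the
      ;; position where font-lock happened to start the search.
      (setq found (save-match-data
                    (save-excursion
                      (org-in-block-p org-math-block-list)))))
    found))

;; Possible registration, assuming the face from the keyword list exists:
;; (font-lock-add-keywords
;;  'org-mode
;;  '((my/org-math-open-delim-matcher (0 'latex-math-delim-face prepend))))
```

The loop keeps searching until a delimiter inside a listed block is found, so a region that begins outside any block can still yield matches further down.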

Maybe I should, but I still don’t quite understand your goal.

I have done a bit of messing with org LaTeX fontification, but nothing as in-depth as trying to create custom keywords/syntax parsing (which it sounds like you are trying to do).

If your actual goal is more general than this, I might have some insight though. Could you give an example of what you want and maybe some context on why you want that?

Hi, @williamspete001 !

Thanks a lot for your message and for offering to take a look! I truly value
your taking the time.

You are right, indeed there could be times when I think I am conveying what my
goal is and yet my words are making not as much sense. Let me try to first describe
the motivations in detail before taking an example out of my files.

MOTIVATION FOR THIS CUSTOM FONT-LOCKING BEHAVIOR

The Short Term Use Case

Actually, my current immediate use case is probably much simpler than you’d expect:
it is very plainly to make mathematical note-taking more visually intuitive.

For formal notations such as LaTeX in org-mode, the impetus behind distinguishing
between being inside and outside certain blocks is really to separate the formal
from the informal. Specifically, I have formed this tendency to deliberate on
the discernment of the intuition from the formalism, especially when faced with
new concepts, and this “principle” (kind of) has influenced how I try to take notes
over the past few months. Given that org-mode comes with these block constructs
which seem very natural for implementing such a separation, I have now come to
regard org blocks, not just source blocks, as an in-buffer switch between different
contexts, such as that between the intuitive and the formal.

Examples of block types which I consider “mathematical” include “DFN” blocks (for
definitions), “THM” blocks (theorems), “LEM” blocks (lemmas), and the like. Without
the desired fontification behavior, however, I am currently relying on latex SRC
blocks or EXPORT blocks to maintain the font locking aspect, and I am willing to
dive into the export backends to make LaTeX SRC blocks behave as though they are
just LaTeX fragments or environments should the need arise. This could be a valid
workaround, but in principle, it should be possible to make any block behave
differently.

Um, I realize that this priority of visual separation over convenience of the export
process could be viewed as an over-exaggeration from the perspective of usability,
but still it is worth the effort for me, as I take notes as a personal practice,
and the joy and the cognitive ease in writing and reviewing them is, in this way,
more essential than the output.

Potential Long Term Benefits

The above is the short-term feasible goal, but indeed two other longer-term, fuzzier
and broader incentives have crossed my mind. I do not expect to carry them out through
sheer force though, estimating that it could still take a long time for me to reach
the prerequisite knowledge and fluency, and also figuring that there is a chance for
them to trivially reduce to existing functionalities or hacks which I have not yet
learned. Right now I just hope that they can be achieved someday, spontaneously
and as a matter of course. And they are:

  1. Some convenient utility to understand nested scopes and fontify them accordingly,
    implemented via the search-based fontification mechanism of emacs;
  2. Perhaps a generalization of the first and certainly the more difficult, to be able
    to instruct emacs to “identify different contexts” within the same buffer and
    assign to them aspects that are usually expected in a typical minor or major mode,
    beyond font locking.

If equipped with this ability to tell emacs to practically endow different portions of
the same buffer with different sets of modes, I imagine it would be much more flexible
and pleasant to write and compute in emacs in a centralized manner.

AN EXAMPLE

An example of the distinct fontification would be the following snippet. In fact this would
be one of only a few I could deliver, as I am still exploring how I should take notes
in org-mode, and ironically haven’t yet fully returned to the learning of what I actually
want to take notes of, but regardless of the condition it is in, I hope it suffices to illustrate
the intention:

Main reference: Folland.
A measure is a generalized notion of area or volume.

An ideal measure on the Euclidean space \(\RR^n\) would be a function \(\u:
\Pow(\RR^n) -> [0, +\8]\) assigning to each set \(E \subset \RR^n\) a number
representing its "size", with the properties:

- It respects countable disjoint unions.
- Congruent sets have the same measure.
- The measure of the unit cube is \(1\).

However, such an ideal measure does not exist, for as a consequence of certain 
axioms of set theory, these conditions are mutually inconsistent.

#+BEGIN_Proof
To see this, take \(n = 1\) and \(E = [0,1]\), partition it by the equivalence
relation \(x \sim y\) iff \(x - y \in \QQ\). Now, there are countably infinitely
many equivalence classes that are pairwise congruent; denote these equivalence
classes by \(E_r\); then \(\u([0,1]) = 1 = \sum_r \u(E_r) = \sum_r c\) for some
constant \(c\), which is impossible.
#+END_Proof

The solution to this situation by current measure theory is to restrict the
domain of \(\u\) to a smaller subcollection of \(\Pow(\RR^n)\) that are
well-behaved in a certain sense.

Here, the LaTeX fragments inside the “Proof” block would be colorful, while
those outside such blocks would be grayscale, because those inside such blocks
are usually richer, longer, harder to parse manually. From the proof a more
intuitive understanding of the failure of the existence of an ideal measure
could be extracted, but the proof is not by itself the intuitive understanding, so
I enclose it with the “Proof” block, signifying that it is a technical detail
rather than the big picture. More often the reverse happens, namely that the
build-up of the intuition precedes the formal verification, but regardless of
the order, the rule I’ve set for myself to follow is to enclose the latter
within a suitable block to tell myself: “it works, I have gone through the
technical details, review them if I need it NOW, otherwise leave them and
focus on the big picture.”


Last but not least, please excuse me if this response is blabbering about, I may
very often not be very good at expressing what ought to be easily conveyable.

Ok, I think I understand better now. To make sure I’m not misunderstanding, you want visual distinction (essentially color coding) of LaTeX fragments that are inside specific org blocks.

While I don’t know much about manually tweaking fontification and font-lock things, I think I have at least a lead that might help you get there without having to figure out all the details of fontification, which might be more trouble than it is worth for what are essentially (at least as I understand it) visual reminders of where you are in the document.

For the basic goal of visual tweaking, you could find some way to get Emacs to fontify your specially named blocks as if they were LaTeX blocks.

For example your goal is already possible with LaTeX blocks, you would just need to tell Emacs to treat Proof (or any other block) as a LaTeX block (which is probably much easier than figuring out the fontification yourself).

An example:

A note: org-highlight-latex-and-related needs to be set to a non-nil value for this to work, but otherwise it seems to be just a matter of telling Emacs to treat proof blocks like LaTeX blocks. Since the block is being treated as LaTeX, org fontification is of course disabled inside it.
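
A minimal sketch of the idea, assuming the block is written as a source block with a made-up language name, for instance `#+begin_src proof` ... `#+end_src`. The name "proof" is purely illustrative, and the value chosen for org-highlight-latex-and-related is just one possible non-nil setting.

```elisp
(with-eval-after-load 'org
  ;; Map the hypothetical language name "proof" to latex-mode.  Entries in
  ;; `org-src-lang-modes' pair a language string with a mode symbol that
  ;; omits the "-mode" suffix.
  (add-to-list 'org-src-lang-modes '("proof" . latex))
  ;; Fontify source blocks using the major mode of their language.
  (setq org-src-fontify-natively t)
  ;; One possible non-nil value; the option takes a list of symbols.
  (setq org-highlight-latex-and-related '(latex)))
```

Buffers that are already open may need to be reverted, or org-mode re-enabled in them, before the change is visible.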

For your more ambitious goal you could look into polymode (a framework for multiple major modes in Emacs), but I personally found that just using C-c ' did everything I really needed while being a bit more responsive.

Hope that was helpful and good luck!

Hi! I want to say a huge thank you for your suggestion. Sorry for the delayed reply, my mind was on something else yesterday, but now I have just managed to handle the original issue of this thread, and the idea you sketched out (which in my understanding is to recast blocks rather than to parse them manually) is exactly what was needed.

THE DETAILS

Specifically, I created a mode derived from org-mode, named laorg-mode (for want of a better name), and added to it the font lock keywords that work well for latex math mode. I then added it to the variable org-src-lang-modes, which completes the fontification and the color coding.
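
Roughly, and only as a sketch under assumptions: the single keyword and the face definition below are stand-ins for the real keyword list and faces, which are not reproduced in full in this thread.

```elisp
(require 'org)

;; Assumed definition; the thread only mentions the face by name.
(defface latex-math-cmd-face
  '((t :inherit font-lock-keyword-face))
  "Face for LaTeX commands inside laorg blocks (illustrative).")

(define-derived-mode laorg-mode org-mode "LaOrg"
  "Org-derived mode with extra font-lock keywords for LaTeX math."
  ;; A nil MODE argument adds the keywords to the current buffer only.
  ;; This one keyword is a placeholder for the full list.
  (font-lock-add-keywords
   nil
   '(("\\\\[[:alpha:]]+" 0 'latex-math-cmd-face prepend))))

;; Let `#+begin_src laorg' blocks be fontified with `laorg-mode'.
(add-to-list 'org-src-lang-modes '("laorg" . laorg))
```

With this in place, a block opened with `#+begin_src laorg` gets org fontification plus the extra keywords, while fragments outside such blocks keep the default look.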

And then, to scale up the delight, I also (finally) defined a backend derived from latex. It is currently rudimentary and reimplements only the transcoding of laorg SRC blocks, but what gladdens me is that it works, and I intend to expand on it so that it also fine-tunes the translation of other org elements and generates latex documents for my own document classes.
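
In outline, such a backend might look like the sketch below. The names my-latex and my/laorg-src-block are hypothetical, and passing the block body through unchanged is a simplifying assumption: any org markup inside the block would need further handling.

```elisp
(require 'ox-latex)

;; Hypothetical transcoder: emit laorg blocks as raw LaTeX and defer
;; every other source block to the stock latex backend.
(defun my/laorg-src-block (src-block contents info)
  "Transcode SRC-BLOCK; laorg blocks are passed through verbatim.
CONTENTS is unused for source blocks.  INFO is the export plist."
  (if (equal (org-element-property :language src-block) "laorg")
      (org-element-property :value src-block)
    (org-export-with-backend 'latex src-block contents info)))

;; Hypothetical backend name; only the src-block transcoder is overridden.
(org-export-define-derived-backend 'my-latex 'latex
  :translate-alist '((src-block . my/laorg-src-block)))
```

Evaluating `(org-export-to-buffer 'my-latex "*My LaTeX Export*")` in an org buffer would then export it with the derived backend.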

Thus the color coding of block-aware latex fragments is solved in a way which circumvents the messy details of the font lock engine (which is not to say that the font lock engine itself is not worth exploring), and it is terse and much sharper than what I could have aimed for had you not pointed me to this direction!

About Multiple Modes in a Single Buffer

A slight deviation from the original aim is that the blocks are still SRC blocks, in place of THM blocks or DFN blocks or Proof blocks. This does not cause trouble in my writing any more, I am detailing this out just to express the lesson that, it appears that the role of “context-switching” (by which is really meant “in-buffer major-mode switching”) is more a privilege of SRC blocks than a common right of all the blocks, and I have not yet found an official way (one that is provided by org) to cast special blocks to SRC blocks.

On the extended goal of multiple modes in a single buffer, I haven’t yet but will next check out the polymode package you mentioned.

Correction: Export is Also Important

Also I would like to correct a previous statement that “the export process is less important than the color coding.” This may have been a sour-grapes situation XD, as I did not manage to breach into the export process before… But as I was compiling some latex files to PDFs yesterday, I was reminded of how seeing the symbols and the equations rendered clearly is indispensable for reviewing, reading the latex source alone wouldn’t compare. I should not have denied that the reviewing of the notes in the form of beautifully typeset documents is just as important as the cognitive ease in writing them.

Glad you were able to get it working!

To play devil’s advocate here, you can always render the LaTeX directly in Emacs. I’m excited for when the faster LaTeX preview stuff that Tecosaur and Karthink are working on gets done and is merged.

The current system is good enough for occasional peeks at the rendered stuff, but too slow for much more. Xenops mode is pretty cool, but I turned it off because it interferes with my template system (it is more focused on other things and is really fiddly for what I wanted to try it for).

On the flip side, I also think that LaTeX is really not worth it if you never export it. You could easily do some simplifications if you didn’t care about rendering.

A final note, another thing that sounds cool that I will never get around to: Lispy typesetting. I have often wished that LaTeX were faster and more powerful, extensible, and clean. That got me dreaming about an alternative built fully in Lisp. Alas, actually doing that would be way more work than it is worth, and I already have more than a lifetime of stuff that I need to do that is more impactful.

If anyone who sees this happens to time travel, I recommend seeing what you can do to try and influence Knuth out of ALGOL and into the Lisp side of history.

Speaking of the new LaTeX preview system, I also watched the demo a while ago and was captivated, but have seldom heard about it since. Now that you have mentioned it, I went searching again, and it looks like an experimental version can already be installed via the instructions here: “org-latex-preview: Set up and troubleshooting”. I managed to set it up in an ephemeral emacs directory; it is really fast and readable, and it does not hinder the fluency of editing. The minor distraction was that it does not seem to auto-detect definitions in the LaTeX headers; instead there is an org-latex-preview-preamble variable that one should attend to. Apart from that it is quite pleasant, and for some reason it has an almost therapeutic effect on me.

The same dream! Despite having only scratched the surface of TeX and friends, I too share some frustration with LaTeX. And then later, picking up emacs propels me to conceive of a world where TeX was written in elisp. Imagine, org-mode parsed to elisp and then directly to the document, changing the typography becomes a one-sided action instead of going to lengths to keep two sides in artificial alignment. But then, TeX and LaTeX are also beautiful in their own ways, and I am also happy that they exist :smile:

I agree and empathize with this wholeheartedly, not to deny their significance, but tools are here to serve the purposes, they are not the end, and perfectionism over the tools could lead us astray from what we actually want to do and/or can do. This clarity you exhibit is something I should constantly remind myself of.

Speaking of the new LaTeX preview system…

Yeah, I’ve known it is installable for quite some time now. I just have never gotten around to actually doing it.

The same dream!

Yay!

But then, TeX and LaTeX are also beautiful in their own ways, and I am also happy that they exist :smile:

True. I still love them better than the alternatives lol (which is nothing lol). But I can’t help imagining a beautiful DSL for it in Lisp. It feels like a case of Greenspun’s tenth rule here. The things that make (La)TeX powerful are just a little bit of a proper Lisp done in a weirder way.

I agree and empathize with this wholeheartedly

Yeah, I don’t want to be a second Lispy Knuth. I’ll just work on the stuff that needs it most now.