Ticket #505 (new enhancement)

Opened 4 years ago

Last modified 3 years ago

Autocompletion for Notation Definitions

Reported by: kohlhase Owned by: vzholudev
Priority: major Milestone:
Component: System Implementation (SI) Version: v0.1.3
Keywords: Cc: clange, cmueller, dmisev, uholzer
Blocked By: Blocking:
Due to close: YYYY/MM/DD Include in GanttChart: no
Dependencies: Due to assign: YYYY/MM/DD

Description

At the moment the patterns for notation definitions only work for the format they are specified in, i.e. OpenMath, Pragmatic MathML, or Strict MathML. But these formats are equivalent: OM and Strict CMML directly, and Pragmatic CMML via http://www.w3.org/TR/2009/WD-MathML3-20090604/chapter4.html#contm.p2s. Therefore it would be possible to generate equivalent patterns in all formats and thus complete the notation definitions, making them much more usable.

It would probably be enough to have the full set of patterns internally, to ensure maintainability.

Change History

  Changed 4 years ago by clange

IIUC you suggest that if the patterns in a notation definition are, e.g., in OpenMath, the equivalent CMML representation should also be generated internally before rendering. Maybe this is not even necessary; instead one could slightly redesign the implementation: From all patterns in the notation definitions, and from all content markup input to be rendered, one could generate a uniform abstract representation (either, let's say, actual OpenMath XML, or even more efficient internal data structures) and then do pattern matching on that level. Would that be hard to implement?
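A minimal sketch of what "pattern matching on that level" could look like, assuming a format-neutral tree has already been built from the input. The class names Expr and PrototypeMatcher, and the label strings such as "sym:set1#map", are hypothetical and not taken from JOMDoc; a slot stands for an <expr name="..."/> placeholder in a prototype.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical format-neutral node; none of these names come from JOMDoc. */
final class Expr {
    final String label;        // e.g. "apply", "sym:set1#map", "var:x", or a slot name
    final boolean slot;        // true for a prototype placeholder such as <expr name="dom"/>
    final List<Expr> children;

    private Expr(String label, boolean slot, List<Expr> children) {
        this.label = label;
        this.slot = slot;
        this.children = children;
    }

    static Expr node(String label, Expr... children) {
        return new Expr(label, false, Arrays.asList(children));
    }

    static Expr slot(String name) {
        return new Expr(name, true, Collections.<Expr>emptyList());
    }

    @Override public String toString() {
        if (slot) return "?" + label;
        if (children.isEmpty()) return label;
        StringBuilder sb = new StringBuilder(label).append('(');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(children.get(i));
        }
        return sb.append(')').toString();
    }
}

final class PrototypeMatcher {
    /** Returns the slot bindings if the input matches the prototype, or null if it does not. */
    static Map<String, Expr> match(Expr prototype, Expr input) {
        Map<String, Expr> bindings = new HashMap<>();
        return matchInto(prototype, input, bindings) ? bindings : null;
    }

    private static boolean matchInto(Expr p, Expr in, Map<String, Expr> bindings) {
        if (p.slot) {
            Expr previous = bindings.putIfAbsent(p.label, in);
            // A slot used twice must bind structurally equal subterms.
            return previous == null || previous.toString().equals(in.toString());
        }
        if (in.slot || !p.label.equals(in.label) || p.children.size() != in.children.size()) {
            return false;
        }
        for (int i = 0; i < p.children.size(); i++) {
            if (!matchInto(p.children.get(i), in.children.get(i), bindings)) return false;
        }
        return true;
    }
}
```

Because the matcher only sees Expr trees, it does not need to know whether a prototype or an input was originally written in OpenMath or in Content MathML; that knowledge would live entirely in whatever code builds the trees.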

follow-up: ↓ 3   Changed 4 years ago by kohlhase

That is also a very good idea, i.e. use some internal OM object and make use of the fact that Strict Content MathML is an OM encoding. That would work.

BUT Pragmatic Content MathML I guess would have to be dealt with separately, since it is much less regular.

in reply to: ↑ 2 ; follow-up: ↓ 4   Changed 4 years ago by clange

Replying to kohlhase:

BUT Pragmatic Content MathML I guess would have to be dealt with separately, since it is much less regular.

Let me ask a more precise question. Let P be some pragmatic CMML, and S its equivalent strict CMML. Are there cases when render(P) is different from render(S)?

  • If so, it gets complicated.
  • If not, we could also translate any pragmatic CMML that we find, both in patterns in notation definitions, and in content-markup input, to its equivalent strict representation and do the internal pattern matching on the strict representation.

in reply to: ↑ 3 ; follow-up: ↓ 5   Changed 4 years ago by kohlhase

Replying to clange:

Replying to kohlhase:

BUT Pragmatic Content MathML I guess would have to be dealt with separately, since it is much less regular.

Let me ask a more precise question. Let P be some pragmatic CMML, and S its equivalent strict CMML. Are there cases when render(P) is different from render(S)?

  • If so, it gets complicated.
  • If not, we could also translate any pragmatic CMML that we find, both in patterns in notation definitions, and in content-markup input, to its equivalent strict representation and do the internal pattern matching on the strict representation.

No, the main idea here is that the renderings should be the same, but we would have to generate the following 'extended notation definition':

<pattern>
  <set>
    <bvar><expr name="var"/></bvar>
    <domainofapplication><expr name="dom"/></domainofapplication>
    <expr name="body"/>
  </set>
</pattern>
<pattern>
  <apply>
    <csymbol cd="set1">map</csymbol>
    <bind>
      <csymbol cd="fns1">lambda</csymbol>
      <bvar><expr name="var"/></bvar>
      <expr name="body"/>
    </bind>
    <expr name="dom"/>
  </apply>
</pattern>

If we are not scared of this, then this should work well.
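Generating the second pattern from the first could be sketched roughly as follows. This is only an illustration of the single set-with-domainofapplication case from the listing above, not the full pragmatic-to-strict algorithm of the MathML 3 spec; the class SetRewriter and its methods are hypothetical, and it uses the JDK's DOM API rather than XOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper; handles only the <set> shape shown in the ticket. */
final class SetRewriter {
    static final String MML = "http://www.w3.org/1998/Math/MathML";

    /** Parses an XML string with namespace awareness and returns its root element. */
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse input", e);
        }
    }

    /** Returns the index-th child element, skipping text and comment nodes. */
    static Element child(Element parent, int index) {
        int seen = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                if (seen == index) return (Element) n;
                seen++;
            }
        }
        throw new IllegalArgumentException(
                "no child element " + index + " in " + parent.getLocalName());
    }

    private static Element csymbol(Document d, String cd, String name) {
        Element s = d.createElementNS(MML, "csymbol");
        s.setAttribute("cd", cd);
        s.setTextContent(name);
        return s;
    }

    /** Rewrites set(bvar, domainofapplication(dom), body) to apply(map, bind(lambda, bvar, body), dom). */
    static Element toStrict(Element set) {
        Element domWrapper = child(set, 1);
        if (!"set".equals(set.getLocalName())
                || !"bvar".equals(child(set, 0).getLocalName())
                || !"domainofapplication".equals(domWrapper.getLocalName())) {
            throw new IllegalArgumentException("not a <set> with bvar and domainofapplication");
        }
        Document d = set.getOwnerDocument();

        Element bind = d.createElementNS(MML, "bind");
        bind.appendChild(csymbol(d, "fns1", "lambda"));
        bind.appendChild(child(set, 0).cloneNode(true));   // <bvar>
        bind.appendChild(child(set, 2).cloneNode(true));   // body expression

        Element apply = d.createElementNS(MML, "apply");
        apply.appendChild(csymbol(d, "set1", "map"));
        apply.appendChild(bind);
        apply.appendChild(child(domWrapper, 0).cloneNode(true)); // domain
        return apply;
    }
}
```

The subtrees are cloned rather than moved, so the original pragmatic element is left untouched; the same rewrite could be applied both to prototypes and to input to be rendered.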

in reply to: ↑ 4   Changed 4 years ago by clange

Replying to kohlhase:

No, the main idea here is that the renderings should be the same, but we would have to generate the following 'extended notation definition':

<pattern>
  <set>
    <bvar><expr name="var"/></bvar>
    <domainofapplication><expr name="dom"/></domainofapplication>
    <expr name="body"/>
  </set>
</pattern>
<pattern>
  <apply>
    <csymbol cd="set1">map</csymbol>
    <bind>
      <csymbol cd="fns1">lambda</csymbol>
      <bvar><expr name="var"/></bvar>
      <expr name="body"/>
    </bind>
    <expr name="dom"/>
  </apply>
</pattern>

If we are not scared of this, then this should work well.

Sure, but IIUC the second prototype (we've called it prototype, not pattern, so far) is equivalent to the first one in that the first one is pragmatic, and the second one is its strict form. And then we have two choices for generating the second one: either as in that listing, as strict CMML/XML, or directly in a more efficient internal representation, if we choose to go for that.

  Changed 3 years ago by nmueller

  • owner changed from nmueller to vzholudev

follow-up: ↓ 8   Changed 3 years ago by dmisev

An efficient internal representation that matches prototypes regardless of whether they are OpenMath or MathML sounds very interesting; I could start working on this.

The ultimate goal would be to eliminate the use of XOM completely (at least in the notation module), and use a pull parser to render a document on the fly. While at it, I could also try making the renderer fully compliant with the notations specification (if I'm not mistaken some things are not supported now).

in reply to: ↑ 7   Changed 3 years ago by clange

Replying to dmisev:

The ultimate goal would be to eliminate the use of XOM completely (at least in the notation module)

So you mean that you would no longer compare literal XOM elements, but custom Java objects? Yes, that sounds like a good solution. I.e. om:OMA would no longer be considered different from m:apply, but both would be parsed into org.jomdoc.something.Apply. I'd suggest reusing as much from the existing JOMDoc library as possible. We already have org.omdoc.jomdoc.lang.math.om with one class per OpenMath element. For complete math objects we have lang.MObject and its subclasses lang.mml.CMObject and om.OMObject. I would suggest introducing common superclasses for all elements of OpenMath and CMML, e.g. lang.Apply with subclasses lang.om.OMA and lang.mml.Apply.
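To illustrate the point that om:OMA and m:apply end up as the same kind of object, here is a small sketch that parses both encodings into one term type. Term and UnifiedParser are hypothetical names, not the existing org.omdoc.jomdoc classes; only applications, symbols and variables are covered, and the JDK DOM parser stands in for whatever parser JOMDoc would actually use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical format-neutral term: a head label plus ordered arguments. */
final class Term {
    final String head;
    final List<Term> args;

    Term(String head, List<Term> args) {
        this.head = head;
        this.args = args;
    }

    @Override public String toString() {
        if (args.isEmpty()) return head;
        StringBuilder sb = new StringBuilder(head).append('(');
        for (int i = 0; i < args.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(args.get(i));
        }
        return sb.append(')').toString();
    }

    @Override public boolean equals(Object o) {
        return o instanceof Term && toString().equals(o.toString());
    }

    @Override public int hashCode() {
        return toString().hashCode();
    }
}

final class UnifiedParser {
    /** Parses OpenMath XML or Strict Content MathML into a Term. */
    static Term parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Element root = f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return convert(root);
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse input", e);
        }
    }

    private static Term convert(Element e) {
        String name = e.getLocalName();
        switch (name) {
            case "OMS":      // <OMS cd="..." name="..."/>
                return leaf("sym:" + e.getAttribute("cd") + "#" + e.getAttribute("name"));
            case "csymbol":  // <csymbol cd="...">name</csymbol>
                return leaf("sym:" + e.getAttribute("cd") + "#" + e.getTextContent().trim());
            case "OMV":      // <OMV name="..."/>
                return leaf("var:" + e.getAttribute("name"));
            case "ci":       // <ci>name</ci>
                return leaf("var:" + e.getTextContent().trim());
            case "OMA":
            case "apply":    // both become the same application node
                return new Term("apply", children(e));
            default:
                throw new IllegalArgumentException("unsupported element: " + name);
        }
    }

    private static Term leaf(String head) {
        return new Term(head, Collections.<Term>emptyList());
    }

    private static List<Term> children(Element e) {
        List<Term> out = new ArrayList<>();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) out.add(convert((Element) n));
        }
        return out;
    }
}
```

With this sketch, an OMA applying arith1#plus to x and y and the corresponding m:apply both come out as apply(sym:arith1#plus,var:x,var:y), so they compare equal. A class hierarchy with common superclasses, as suggested above, would replace the single Term class in a real design.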

and use a pull parser to render a document on the fly. While at it, I could also try making the renderer fully compliant with the notations specification (if I'm not mistaken some things are not supported now).

At the moment there is no such thing as an up-to-date specification. The "notation" chapter 19 of the OMDoc 1.3 spec points to the OMDoc 1.6 spec, which is wrong. There is an outdated draft of what the notations in OMDoc 1.6 might look like.

Can you be more specific about what you think is not supported in the implementation but should be? I suppose the most reliable "specification" is the current implementation plus a few open tickets.

  Changed 3 years ago by dmisev

Yes, Java objects. I was thinking of reusing the OpenMath objects in JOMDoc plus defining objects for prototype, rendering, etc.

At the moment there is no such thing as an up-to-date specification.

I know of the Notations for Living Mathematical Documents paper, and I think there were some things that are not supported, but I'm not sure which exactly.

  Changed 3 years ago by clange

BTW, one more comment: When you start with the unified OpenMath/MathML representation, bear in mind that MathML's csymbol does not have a cdbase attribute. I.e. there is no straightforward translation of OMS/@cdbase. The mechanism, in the general case, is so ridiculously complicated and impractical (see this e-mail thread for details) that I'd recommend

  • either not implementing it for now
  • or deviating from the MathML spec and "implementing" it as csymbol/@cdbase, leaving the post-cleanup to some other tool that we won't implement.
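The second option could look roughly like this: a sketch that copies OMS/@cdbase onto a non-standard csymbol/@cdbase attribute. CdbaseTranslator and its methods are hypothetical names. It only copies a cdbase attribute that is present on the OMS element itself, and the resulting csymbol deliberately deviates from the MathML spec, as discussed above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical translator; csymbol/@cdbase is NOT part of the MathML spec. */
final class CdbaseTranslator {
    static final String MML = "http://www.w3.org/1998/Math/MathML";
    static final String OM = "http://www.openmath.org/OpenMath";

    /** Creates an empty DOM document to build elements in. */
    static Document newDocument() {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            return f.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds an OMS element for demonstration; cdbase may be null. */
    static Element oms(Document doc, String cdbase, String cd, String name) {
        Element oms = doc.createElementNS(OM, "OMS");
        if (cdbase != null) oms.setAttribute("cdbase", cdbase);
        oms.setAttribute("cd", cd);
        oms.setAttribute("name", name);
        return oms;
    }

    /** Translates OMS to csymbol, keeping cdbase as a non-standard attribute. */
    static Element toCsymbol(Element oms) {
        Element cs = oms.getOwnerDocument().createElementNS(MML, "csymbol");
        cs.setAttribute("cd", oms.getAttribute("cd"));
        if (oms.hasAttribute("cdbase")) {
            // Deviation from MathML: left for some later clean-up tool to resolve.
            cs.setAttribute("cdbase", oms.getAttribute("cdbase"));
        }
        cs.setTextContent(oms.getAttribute("name"));
        return cs;
    }
}
```

An OMS without a cdbase attribute yields a plain csymbol, so documents that never use cdbase would stay valid MathML.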

follow-up: ↓ 12   Changed 3 years ago by dmisev

Ok, thanks for letting me know! The second option seems better to me.

Here's how I plan to do this ticket:

  1. Integrate org.symcomp.openmath into JOMDoc - there were a few emails about this library on the JOMDoc mailing list, and it seems much better than what we have in JOMDoc as it supports many different OpenMath formats. And if OMDoc allows non-XML OpenMath formats in some future version (for example POPCORN seems very nice), we'll be ready for it.
  2. Use this as a basis for the unified OpenMath/MathML representation, by implementing a translator for Strict Content MathML first (to the org.symcomp.openmath objects), and then for Pragmatic Content MathML according to this algorithm: http://www.w3.org/TR/MathML3/chapter4.html#contm.p2s
  3. Finally integrate all this into the rendering.

What do you think?

in reply to: ↑ 11   Changed 3 years ago by clange

Replying to dmisev:

Here's how I plan to do this ticket: ...

That's an excellent plan! Note that we know the developers of that library (Peter Horn and Dan Roozemond) quite well, so don't hesitate to ask them questions (and say hello to them from KWARC).

follow-up: ↓ 15   Changed 3 years ago by dmisev

I've just found out that this library depends on the Scala runtime - scala-library.jar is 6MB, and including it in JOMDoc would almost double its distribution size, so I'm not sure whether we care much about JOMDoc's size. Probably not, because it's mostly used by server systems like TNTBase or SWiM, but there's also Gemse, for example, for which size probably matters. But if we include it, we'll be free to develop JOMDoc in Scala as well; I think for most things in JOMDoc it might actually be more suitable than Java.

Alternatively there's the RIACA OpenMath Library, but it doesn't seem different from what we have in JOMDoc.

  Changed 3 years ago by vzholudev

  • cc uholzer added

Let's CC Urs for that.

in reply to: ↑ 13 ; follow-ups: ↓ 17 ↓ 18   Changed 3 years ago by uholzer

Replying to dmisev:

I've just found out that this library depends on the Scala runtime - scala-library.jar is 6MB, and including it in JOMDoc would almost double its distribution size, so I'm not sure whether we care much about JOMDoc's size. Probably not, because it's mostly used by server systems like TNTBase or SWiM, but there's also Gemse, for example, for which size probably matters. But if we include it, we'll be free to develop JOMDoc in Scala as well; I think for most things in JOMDoc it might actually be more suitable than Java.

Gemse is at the moment (release 1.0.12) about 2.5MB in size, including JOMDoc and the libraries it requires to do what Gemse wants it to do. Including the Scala library would increase this to 8.5MB. I don't think that users care that much whether they have to download 8.5MB instead of 2.5MB in order to install Gemse. And since you have to install Gemse locally anyway (Java-based features cannot be used in the demo version of Gemse), this should not be a big problem.

But I once thought about rendering formulas on the client side on a web page using a Java applet. There it would certainly increase the loading time of the page, although I think 12MB would not be the end of the world, especially because a web page should already be shown while its applets are still loading.

Considering the size of scala-library.jar, it probably contains a lot of stuff which is not used at all by the library you plan to use. Maybe you could split it up somehow?

  Changed 3 years ago by uholzer

Please preserve the following features:

  • When generating parallel markup, the Content part should remain the same (from an XML point of view). The content MathML or OpenMath I hand over to JOMDoc for rendering even contains some attributes in my own private namespace and I rely on them being present in the Content part of the output of JOMDoc. (Very important for me)
  • It would be great if it would be possible (is it right now?) to include attributes and elements in my own namespace in the prototype which then also are compared when doing the matching. (Not so important for me)

in reply to: ↑ 15 ; follow-up: ↓ 19   Changed 3 years ago by clange

Replying to uholzer:

Considering the size of scala-library.jar, it probably contains a lot of stuff which is not used at all by the library you plan to use. Maybe you could split it up somehow?

Without knowing the SCIEnce OpenMath library, I guess that is not something we can easily do. I'd suggest doing the following:

  1. use the SCIEnce library (which requires the Scala library)
  2. notify the SCIEnce developers of the problem (by whichever way they prefer: bug tracker, mailing list, …) and ask them to provide a version without the Scala library, if possible

in reply to: ↑ 15   Changed 3 years ago by dmisev

Replying to uholzer:

Considering the size of scala-library.jar, it probably contains a lot of stuff which is not used at all by the library you plan to use. Maybe you could split it up somehow?

I thought the same and tried to remove some stuff, but without a tool to compute exactly which things are used, it's impossible.

When generating parallel markup, the Content part should remain the same (from an XML point of view). The content MathML or OpenMath I hand over to JOMDoc for rendering even contains some attributes in my own private namespace and I rely on them being present in the Content part of the output of JOMDoc. (Very important for me)

Yes, this should stay the same.

It would be great if it would be possible (is it right now?) to include attributes and elements in my own namespace in the prototype which then also are compared when doing the matching. (Not so important for me)

At the moment no, but I'll see if it's possible to make it work.

in reply to: ↑ 17   Changed 3 years ago by dmisev

Replying to clange:

2. notify the SCIEnce developers of the problem (by whichever way they prefer: bug tracker, mailing list, …) and ask them to provide a version without the Scala library, if possible

They only use Scala for the individual OpenMath objects, like OMSymbol, OMApply, etc. (which are just bean classes); all the rest is done in Java, so I think it should be easy to provide a Java-only version. I'll ask them about this.

follow-up: ↓ 21   Changed 3 years ago by dmisev

I emailed Peter and the problem is solved, at least for now - we'll use version 1.4.0 of the library which is Java only, whereas the newest version 1.5.0 mostly consists of the rewrite in Scala, so there's not much difference.

in reply to: ↑ 20   Changed 3 years ago by clange

Replying to dmisev:

I emailed Peter and the problem is solved, at least for now - we'll use version 1.4.0 of the library which is Java only,

thanks for solving it so quickly!

whereas the newest version 1.5.0 mostly consists of the rewrite in Scala, so there's not much difference.

AFAIK the SCIEnce project will end soon, but still there might be new features or bugfixes coming up in future versions of the library, so we need to be prepared for eventually reintroducing Scala.

(In the very long run, the Scala question will come back anyway, as we'll have to merge JOMDoc with Florian's MMT for OMDoc 1.6.)
