| 72 | | |
| 73 | | % local packages (my legacy) |
| 74 | | % \input{../macros/mathe} |
| 75 | | |
| 76 | | % \hypersetup{bookmarksdepth=4} |
| 77 | | |
| 78 | | % % Page styles |
| 79 | | % \pagestyle{scrheadings} |
| 80 | | % \clearscrheadfoot |
| 81 | | % \ohead{\headmark} |
| 82 | | % \ofoot[\pagemark]{\pagemark} |
| 83 | | % \setheadsepline{0.3pt}[\color{gray}] |
| 84 | | % \setkomafont{pagehead}{\normalfont\small\sffamily\slshape} |
| 85 | | % \setkomafont{pagenumber}{\normalfont\small\sffamily\slshape} |
| 86 | | % |
| 87 | | % Listing styles |
| 88 | | \RecustomVerbatimEnvironment{Verbatim}{Verbatim}{fontsize=\footnotesize,commandchars=\\\{\},codes={\catcode`$=3\catcode`$=3}} |
| 89 | | |
| 90 | | % \lstset{float=htb,columns=flexible,frame=lines,language=[omdoc]XML,basicstyle=\scriptsize, |
| 91 | | % indexstyle=\indextt,indexstyle=[1]\indexelement,indexstyle=[2]\indexattribute, |
| 92 | | % numbers=left,stepnumber=5,numberstyle=\tiny,showstringspaces=false} |
| 93 | | % \lstset{ |
| 94 | | % basicstyle=\footnotesize\ttfamily, |
| 95 | | % basewidth=.55em, |
| 96 | | % lineskip=-.35ex |
| 97 | | %} |
| 98 | | % |
| 99 | | % % Array setup |
| 100 | | % \newcolumntype{v}[1]{>{\raggedright\arraybackslash\hspace{0pt}}p{#1}} |
| 101 | | |
| 102 | | % % TikZ setup |
| 103 | | % \usetikzlibrary{arrows,positioning} |
| 104 | | % \tikzstyle{default}=[font=\sffamily,>=triangle 60] |
| 105 | | % \tikzstyle concept=[font=\sffamily\bfseries,draw,minimum height=3.5ex,rounded corners] |
| 163 | | The research in the field of CAS has received mass recognition in May 2009 when \wa~\cite{URL:WolframAlpha}, a computational knowledge engine, was launched. An example of a CAS is Mathematica~\cite{URL:Mathematica}, whose 7\textsuperscript{th} version was released in the first quarter of 2009, a system that provides many possibilities in interacting with mathematical formulae. \wa is based on two primary resources for the answers it provides (as it is classified as being an ``answer engine''), the Mathematica backend and the knowledge base. From a mathematical point of view, the way this works is that \wa always tries to return everything that it knows about a certain formula (factorization, roots, plot). |
| 164 | | |
| 165 | | An integration of the services provided by \wa with the \jobad architecture makes sense, as this would facilitate the users' immediate access to more information regarding the formulae that the user explores in a document, thus providing, besides the already existing information services (e.\,g.: definition lookup), another way of acquiring background information regarding the topic, thus making it easier for users to understand complex mathematical formulae. This data is instantly computable and computed and is available for access via the \wa website or via a webservice API specially designed for developers. Still, \wa is only one example of a CAS and the \jobad architecture should not be confined to only using this one. Another example of a similar system with which JOBAD could interact are those CAS-es that deal with OpenMath content and which can be reached via the SCSCP protocol~\cite{HorRoz:ossp09}. |
| 166 | | |
| 167 | | The work envisioned with this project has two main parts that, in turn, regard the improvement and extension of the already existing JOBAD architecture~\cite{GLR:WebSvcActMathDoc09}. The idea is to develop another module for the already existent JOBAD architecture that will allow the user to interact with more web-driven mathematics, via the \wa computational knowledge engine. The extension also regards a generalized method of a ``Send To'' menu that will allow the user to redirect an annotated MathML fragment (formula) to some other sources of information, in this case, a CAS, in particular, \wa. \wa was chosen as an initiator for the ``Send To'' method, as this seems the most useful, rational and complex choice for a user who wants to look up mathematical content on the web, as \wa is also capable of plotting different functions, identify equations, terms etc. Still, this would be only a test case for the menu, as this can be further expanded and further destinations for the ``Send To'' menu can be provided. This new extension, in theory, should work with any CAS, the only constraint being the CAS and the mathematical content in the document should have a common language --- such as OpenMath ---, or that there should be a one-to-one mapping between these languages. In figure \ref{jobad} you can see an updated diagram of the entire \jobad architecture, with the components in red being the ones to be added (also, the proxy will be redesigned). |
| | 94 | Research in the field of CAS received mass recognition in May 2009, when \wa~\cite{URL:WolframAlpha}, a computational knowledge engine, was launched. An example of a CAS is Mathematica~\cite{URL:Mathematica}, whose 7\textsuperscript{th} version was released in the first quarter of 2009 and which provides many possibilities for interacting with mathematical formulae. \wa, which is classified as an ``answer engine'', bases its answers on two primary resources: the Mathematica backend and the knowledge base. From a mathematical point of view, \wa always tries to return everything that it can compute (factorization, roots, plot) or already knows about a certain formula (as one can see in Figure~\ref{wares}). |
| | 95 | |
| | 96 | An integration of the services provided by \wa with the \jobad architecture makes sense, as it would give users immediate access to more information about the formulae they explore in a document. Besides the already existing information services (e.\,g.\ definition lookup), it would provide another way of acquiring background information on the topic, making it easier for users to understand complex mathematical formulae. This data is computed on demand and is available via the \wa website or via a web service API designed for developers. Still, \wa is only one example of a CAS, and the \jobad architecture should not be confined to using only this one. Other examples of systems with which JOBAD could interact are those CASs that deal with OpenMath content and can be reached via the SCSCP protocol~\cite{HorRoz:ossp09}. |
| | 97 | |
| | 98 | \begin{figure}[htp] |
| | 99 | \centering |
| | 100 | \includegraphics[width=\textwidth]{./img/wa.png} |
| | 101 | \caption{\small{\wa results for ``subset''}} |
| | 102 | \label{wares} |
| | 103 | \end{figure} |
| | 104 | |
| | 105 | |
| | 106 | The work envisioned in this project has two main parts, which concern the improvement and the extension of the existing JOBAD architecture~\cite{GLR:WebSvcActMathDoc09}. The idea is to develop another module for the JOBAD architecture that will allow the user to interact with more web-driven mathematics via the \wa computational knowledge engine. The extension also includes a generalized ``Send To'' menu that will allow the user to redirect an annotated MathML fragment (a formula) to other sources of information, in this case a CAS, and in particular \wa. \wa was chosen as the first destination of the ``Send To'' menu because it seems the most useful and comprehensive choice for a user who wants to look up mathematical content on the web: \wa is also capable of plotting functions, identifying equations and terms, etc. Still, this is only a test case for the menu, which can be expanded with further destinations. In theory, this new extension should work with any CAS. The only constraints are that the CAS and the mathematical content in the document have a common language, such as OpenMath, or that there is a one-to-one mapping between their languages, and that there is a way to specify the desired query to the CAS in the interface. The difference between \wa and other CASs is that \wa automatically returns plots, derivatives and related formulas, while in other CASs these, if supported, have to be asked for explicitly. This functionality needs to be embedded in the respective service GUI elements and will have to be adjusted per CAS. Figure~\ref{jobad} shows an updated diagram of the entire \jobad architecture, with the components to be added shown in red (the proxy will also be redesigned). |
| 184 | | As previously stated, there are two ways in which the \wa content can be accessed, one of them being via the actual web interface, while the other one involves using the \wa web service API that allows user to query for content from other applications. As the first part seems infeasible for an automated client, we have applied for and obtained a grant for research purposes on the \wa architecture, which consists of a \wa API key. Once there is one possible way to query the \wa engine, the only problem that still remains is establishing a common language for the interaction between our document and \wa. |
| 185 | | |
| 186 | | As \wa does not support querying mathematical content via semantically enriched MathML (content MathML) or OpenMath, one is required to query the engine via other mechanisms, this being the first important step towards querying the knowledge base. The common syntax for querying seems to be the Mathematica language which is familiar to the \wa engine, as it relies on a Mathematica backend, therefore requiring a translation from the existing Content MathML and OpenMath standard served by the server (in this case TNTBase or MMT) to Mathematica. We are aware of existing transformations between OpenMath and Mathematica, e.\,g.\ on the server as a part of the MathDox infrastructure~\cite{mathdox:translation:on,CuypCoheKnop2008g4}, or on the client as a part of the Sentido formula editor~\cite{Palomo:06}, and currently evaluating how to reuse them for our purposes. |
| 187 | | |
| 188 | | Also, taking into consideration the limitations of JavaScript, we will need to set up a proxy in order to access \wa, as JavaScript code is not allowed to provide data unless it comes from the same domain~\cite{URL:SOP}. The purpose of the proxy will be to only prepare a request for forwarding to the \wa server and, when the data is available, to provide it back to the JavaScript client, given that the key that we received from \wa cannot be exposed and will be stored in the proxy. Also, depending on the nature of the data, the proxy might alter the structure of the retrieved document in order to save post-processing on the client side. |
| | 123 | In the initial phase of the project, the retrieval of data revolved around two ideas. The first was the brute-force approach: query the \wa website, wait for it to load (as it relies heavily on JavaScript and AJAX requests), retrieve the content of the web page via XPath or something similar, and display it in the associated dialog box. This approach had several issues: |
| | 124 | \begin{itemize} |
| | 125 | \item The \wa website provides its results only as images. The images are generated on the fly and deleted shortly after the AJAX request has completed; therefore, there was no way to return the images, which are rather important in the case of functions. |
| | 126 | \item The results were displayed via JavaScript and AJAX, which means that retrieving the loaded page through the Java proxy would be hard, if not impossible. |
| | 127 | \item One cannot choose the format of the retrieved results, which are images by default, even for mathematical formulae. |
| | 128 | \item Further content cannot be retrieved or filtered (we are interested in retrieving the meaning of the formulas, not just their graphical representation). |
| | 129 | \end{itemize} |
| | 130 | |
| | 131 | The \wa service provides a web-based API for clients to integrate the computational and presentation capabilities of \wa into their own applications or web sites. The \wa web service allows one to query the database as one would query the actual website, while also providing additional functionality. As the approach above seems infeasible for an automated client, we applied for and obtained a grant for research purposes on the \wa architecture, which consists of a \wa API key. As for the extra functionality, the output of the \wa web service can be filtered by the developer in order to better accommodate the needs of the user. The filtering concerns the possible representations of the result, with options for visual representations (e.g.\ images, HTML, PDF) or textual representations (e.g.\ plain text, Mathematica format, ExpressionML~\cite{URL:ExpressionML}, etc.). With a way to query the \wa engine in place, the only remaining problem is establishing a common language for the interaction between our document and \wa. |
| | 132 | |
| | 133 | As \wa does not support querying mathematical content via semantically enriched MathML (Content MathML) or OpenMath, one has to query the engine via other mechanisms; this is the first important step towards querying the knowledge base. The most suitable query syntax seems to be the Mathematica language, which the \wa engine understands because it relies on a Mathematica backend. This requires a translation from the Content MathML and OpenMath served by the server (in this case TNTBase or MMT) to Mathematica. We are aware of existing transformations between OpenMath and Mathematica, e.\,g.\ on the server as a part of the MathDox infrastructure~\cite{mathdox:translation:on,CuypCoheKnop2008g4}, or on the client as a part of the Sentido formula editor~\cite{Palomo:06}, and have decided to use the MathDox infrastructure for translation. |
| | 134 | |
| | 135 | Also, taking into consideration the limitations of JavaScript, we need to set up a proxy in order to access \wa, as JavaScript code is not allowed to retrieve data unless it comes from the same domain (``Same Origin Policy'')~\cite{URL:SOP}. The proxy prepares a request, forwards it to the \wa server and, when the data is available, passes it back to the JavaScript client; the key that we received from \wa must not be exposed and is therefore stored in the proxy. Depending on the nature of the data, the proxy might also alter the structure of the retrieved document in order to save post-processing on the client side. In short, the proxy interacts with the ``outside world'' of the application, retrieves the necessary content, assembles it and returns it to the user. Given the choice of formats in which \wa can return the result, the proxy divides its work into two subtasks: a content-oriented task and a general information retrieval task. |
| | 136 | |
| | 137 | The workflow of the content-oriented task of the proxy is visualized in Figure~\ref{proxy_cont} and explained below. The \jobad architecture was initially intended for the General Computer Science lecture notes at Jacobs University, which are written in s\TeX\ (semantically enriched \TeX)~\cite{Kohlhase04:stex,Kohlhase:ulsmf08} and are transformed and hosted in TNTBase~\cite{ZhoKoh:tvsx09:biblatex,TNTBase:demo}. Still, the system is not constrained to that and can serve many other purposes. One example is the LATIN project (Logic ATlas and INtegrator)~\cite{LATIN:url}, which aims to use a ``logics as theories/translations as morphisms'' approach to achieve the interoperability of both system behavior and represented knowledge (the Logic Integrator), and to obtain a comprehensive and interconnected network of formalizations of logics of computational logic systems (the Logic Atlas). Another example comes from history: in the first half of the 20\textsuperscript{th} century, a group of (mainly) French mathematicians started writing up the foundations of set theory and published under the common pseudonym Nicolas Bourbaki. For this collection of books, however, there is no digitized version that would allow users to properly explore, e.g., the foundations of set theory. |
| | 138 | |
| | 139 | \begin{figure}[htp] |
| | 140 | \centering |
| | 141 | \includegraphics[width=\textwidth]{./img/jobad_proxy.png} |
| | 142 | \caption{\small{Proxy architecture - content oriented}} |
| | 143 | \label{proxy_cont} |
| | 144 | \end{figure} |
| | 145 | |
| | 146 | The test case for the integration of \wa services into \jobad was the previously mentioned lecture notes, which are displayed in Presentation MathML and carry content annotations in OpenMath. As the annotations go down to the level of individual symbols, it is easy for the \wa service of the \jobad architecture to find the OpenMath representation associated with the selected text and make a POST request to the proxy, which runs on the same domain and port (due to the ``Same Origin Policy''). The proxy then determines whether the content is OpenMath and, if so, sends a request to a web service running on the MathDox~\cite{URL:MathDoxOM2M} website, which translates the OpenMath content to a Mathematica expression. As \wa is based on Mathematica (the plots, expansions etc.\ are computed via Mathematica), the engine understands the Mathematica language more easily and the computed results are more relevant to the search, as no natural language processing (NLP) techniques need to be employed to transform the input. For example, an input such as ``Sqrt[x]'' might produce more relevant results than ``square root of x''; for this simple test case the results are identical, but for more complex queries NLP might not work. The converted OpenMath expression is then passed to \wa for evaluation in two steps: the first request asks for Mathematica output (a representation of the formula in the Mathematica language) and targets the content and meaning of the formula, while the second request retrieves pictures and a Presentation MathML representation of the results. |
| | 147 | |
| | 148 | |
| | 149 | Provided that the first query was successful (which can easily be verified in the result of the \wa query), the system should proceed to transform the retrieved Mathematica content into a displayable form (Presentation MathML) while preserving the associated content annotation. For this, the following possibilities have been investigated: |
| | 150 | \begin{itemize} |
| | 151 | \item \emph{NB2OMDoc}: Developed by Klaus Sutner, NB2OMDoc~\cite{Sutner:cmnto04} is a Mathematica package that transforms Mathematica code (version 4.2; the latest version is 7) to OMDoc (version 1.2). The disadvantages of this system are that it requires Mathematica to be installed on the proxy computer and that it is designed for old versions of both Mathematica and OMDoc. In addition, one would have to transform (render) the OMDoc content to MathML (Presentation and Content MathML), a step that would be provided by TNTBase and JOMDoc~\cite{JOMDoc:web}, a Java API for OMDoc documents (as illustrated in the picture). |
| | 152 | \item \emph{Mathematica web service}: As pointed out in the Mathematica documentation [\url{http://reference.wolfram.com/mathematica/XML/tutorial/MathML.html}], Mathematica is capable of exporting its formulas to both Content and Presentation MathML. So, one could design a web service that starts Mathematica, inputs a formula, converts it to MathML and retrieves the result. This is not feasible, as Mathematica notebook files (with the extension \emph{nb}) have a proprietary format and extracting content from such a file is not easy. Another drawback is that one would have to start Mathematica for each request (as we are not aware of a Mathematica daemon), which takes more than 10 seconds even on a new computer and makes such a web service not user-friendly. An example in this area is WITM~\cite{URL:WITM} (Web Interface to Mathematica), which provides Mathematica interaction inside the browser. Still, the main constraint is that WITM (and similar attempts) is intended to give a small number of licensed users remote access to Mathematica kernels, but not simultaneously (a large number of users might cause interference in the results). |
| | 153 | \item \emph{Sentido formula editor}: Developed by Alberto Gonzalez Palomo as part of the Sentido~\cite{URL:SentidoFormEditor} editor, browser and environment for OMDoc, it is a JavaScript extension that translates between different representation formats for mathematical formulae. With this approach, all translation between the formats (Mathematica to OMDoc) would be done on the client side. The drawbacks of this method are that the entire library is required and that there seems to be no interface for just transforming between the formats without enabling the other features. |
| | 154 | \end{itemize} |
| | 155 | |
| | 156 | Since each of the methods presented above has its own (major) drawbacks and would require more time to integrate, we consider the integration of a Mathematica to OpenMath/OMDoc/MathML translator as future work. |
| 208 | | One of the last features to be investigated upon is the extension of the existing document format with a welcome screen and a setting preserving feature. For example, the use cases for this feature would be when the user first visits a document or when the user revisits a certain document. When the user arrives for the first time on a document, he is requested to choose the services that will be made available to that webpage (e.\,g.: definition lookup, CAS, etc.) and these are saved for the case in which the user returns such that the next time the page is loaded, only the previously selected services are loaded, thus reducing the load time and the load of the server, attempting an approach to optimization of the entire webpage loading. The main idea is that it should not be obtrusive to the user and, in case the user is for the first time visiting a document, to load a reasonable subset of the services, a set that can be later decided upon via the provided visual interface. The data will be saved in the browser (most likely a cookie for each document), thus allowing easy retrieval of the data. Also, a button for interactively modifying the preferences for a certain document will be made available. This can be further expanded in a server-side saved user profile that can be used in other projects as the adaptive document browser Panta Rhei~\cite{panta:on,MuKo:pr07}. |
| | 169 | The last part of this project concerned the extension of the interface with another service that allows other services to be loaded dynamically, thus giving the user even more freedom of configuration and leading towards more personalized active mathematical documents. |
| | 170 | |
| | 171 | This extension first adds a text at the top of the document (``Click me to configure the loaded modules'') and uses the same jQuery UI dialog (as one can see in Figure~\ref{modules}), only that this time the dialog is modal: everything except the dialog is grayed out, and the underlying document cannot be accessed until either the form is confirmed (via the \emph{Ok} button) and the necessary modules are loaded, or the dialog is closed via the \textbf{x} button. Moreover, students might access a document, or several documents on the same domain, multiple times, and having to load the same modules over and over might become annoying. Therefore, in addition to loading the necessary services, this module also stores the loaded services in a cookie; each time a page is loaded, the cookie is retrieved and the modules that were loaded at the last access of the web page are loaded again. The list of available services is not static: each time the text at the top of the page is clicked, a request is sent to the server asking for the available services. This can be further expanded into a server-side user profile that can be used in other projects, such as the adaptive document browser Panta Rhei~\cite{panta:on,MuKo:pr07}. |
| | 172 | |
| | 173 | |
| | 174 | \section{Example test case} |
| | 175 | In this section we present an example workflow for a test document. The user arrives at the test document and no services (besides the service loading system) are loaded, so no functionality is apparent. Once the user clicks the text at the top of the page that allows the loading of additional modules, a request is sent to the server asking for the available services; the dialog is populated and pops up, allowing the user to check the \emph{wolframalpha} checkbox (see Figure~\ref{modules}). |
| | 176 | \begin{figure}[htp] |
| | 177 | \centering |
| | 178 | \includegraphics[width=\textwidth]{./img/modules.png} |
| | 179 | \caption{\small{The user loads the \emph{wolframalpha} service}} |
| | 180 | \label{modules} |
| | 181 | \end{figure} |
| | 182 | Once both the \emph{wolframalpha} and the additional \emph{folding} services are loaded, the user proceeds to the document and, after each right click, is presented with a context menu created dynamically for that document element. Assume the user right-clicks on the mathematical fragment $\sqrt{x}$, whose associated XHTML fragment, containing both Presentation MathML and annotations in OpenMath format, is presented in Figure~\ref{root_ML}. The user then receives a visual confirmation of the action via a context menu, as one can see in Figure~\ref{root_menu}. |
| | 183 | \begin{figure}[htp] |
| | 184 | \centering |
| | 185 | \includegraphics[width=0.5\textwidth]{./img/root_menu.png} |
| | 186 | \caption{\small{The user performs a right click on the $\sqrt{x}$ symbol}} |
| | 187 | \label{root_menu} |
| | 188 | \end{figure} |
| | 189 | \begin{figure}[htp] |
| | 190 | \hspace*{-1.7cm} |
| | 191 | \centering |
| | 192 | \includegraphics[width=1.2\textwidth]{./img/root_ML_OM.png} |
| | 193 | \caption{\small{The Presentation MathML and the associated OpenMath content annotation representations of $\sqrt{x}$}} |
| | 194 | \label{root_ML} |
| | 195 | \end{figure} |
| | 196 | If a user were to access the \wa website and search for the Mathematica representation of the OpenMath fragment, in this case \emph{Sqrt[x]}, the result page would look like Figure~\ref{root_website}. In \jobad, after the request is processed, the \wa content is retrieved on the client side and the user sees something resembling Figure~\ref{root_wa}. |
| | 197 | |
| | 198 | \begin{landscape} |
| | 199 | \begin{figure}[htp] |
| | 200 | \centering |
| | 201 | \includegraphics[width=1.6\textwidth]{./img/root_wa.png} |
| | 202 | \caption{\small{Part of the \wa results embedded into the original document}} |
| | 203 | \label{root_wa} |
| | 204 | \end{figure} |
| | 205 | \end{landscape} |
| | 206 | |
| | 207 | \begin{figure}[htp] |
| | 208 | \centering |
| | 209 | \includegraphics[width=\textwidth]{./img/root_website.png} |
| | 210 | \caption{\small{Part of the \wa website search results for \emph{Sqrt[x]}}} |
| | 211 | \label{root_website} |
| | 212 | \end{figure} |