Friday, April 26, 2013


Hi everyone!


This is Gbenga and Max, coming to you from Rice University in Houston. This semester we were tasked with building a media plugin on top of the OERPUB Connexions Importer. With the growth of online educational resources such as Khan Academy, Coursera, and Connexions, there’s been an explosion in available educational media. Authors are seeking new ways to deliver content that is both educational and interactive, so that they can teach students in more natural and compelling ways.


To support these authors, our goals were to:

  • Build a plugin for the Aloha editor which enables embedding of media elements from multiple sites through a simple interface
  • Provide media support for content from YouTube, Vimeo, and SlideShare
  • Embed media in a format compatible with Connexions to enable content sharing via remixable modules
  • Streamline the user experience for media insertion


We faced several technical challenges in providing a unified interface for a diverse array of content sources. Our main struggle was providing a unified search, because each source’s API offers a different level of support. Currently, we provide the ability to search YouTube content but not Vimeo or SlideShare. This limitation is partly due to the authentication requirements that Vimeo and SlideShare impose for advanced API functionality. Supporting authentication would require a server-side component, which goes against the plug-and-play principle of editor plugins. Support for this feature may be added in the future if a convincing use case is found.


We were successful in providing a unified interface for inserting content from multiple sources based on a provided URL. A user can use the site’s own content discovery tools to find their desired content before simply copying the URL of that content into our media dialog. For a more comprehensive look at our media plugin, let’s go to the walkthrough below!


Media Plugin Walkthrough


Step 1. A typical user would begin by clicking on the video icon to bring up the media picker.



Step 2. After bringing up the media plugin dialog, the user has two choices: they can either enter a URL from one of the supported content sources to take advantage of our URL validation and parsing to embed video or slides, or perform a search of content sources and select from a list of matching results.

The screenshot below shows a URL from YouTube that has been copied and pasted into the URL textbox. Note that the text box glows green if the importer recognizes the URL pattern and can parse video information from it, and red otherwise.
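
A minimal sketch of this kind of URL check, written in Python purely for illustration. It is not the plugin's code, which runs in the browser inside Aloha. The function name parse_media_url, the URL patterns, and the 11-character YouTube ID length are all assumptions.

```python
import re
from urllib.parse import urlparse, parse_qs


def parse_media_url(url):
    """Return (source, media_id) for a recognized URL, or None.

    The patterns below are illustrative assumptions, not the plugin's rules.
    """
    parts = urlparse(url.strip())
    if parts.scheme not in ("http", "https"):
        return None

    # Normalize the host so "www.youtube.com" and "youtube.com" match alike.
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path

    if host == "youtube.com" and path == "/watch":
        # Long form: the video ID is carried in the "v" query parameter.
        video_ids = parse_qs(parts.query).get("v", [])
        if video_ids and re.fullmatch(r"[\w-]{11}", video_ids[0]):
            return ("youtube", video_ids[0])
    elif host == "youtu.be":
        # Short form: the video ID is the whole path.
        match = re.fullmatch(r"/([\w-]{11})", path)
        if match:
            return ("youtube", match.group(1))
    elif host == "vimeo.com":
        # Assumed form: a purely numeric video ID as the path.
        match = re.fullmatch(r"/(\d+)", path)
        if match:
            return ("vimeo", match.group(1))
    elif host == "slideshare.net":
        # Assumed form: /<user>/<presentation-slug>
        match = re.fullmatch(r"/([\w-]+)/([\w-]+)", path)
        if match:
            return ("slideshare", match.group(1) + "/" + match.group(2))
    return None
```

In a dialog like ours, the text box would glow green when parse_media_url(url) returns a result and red when it returns None.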



The screenshot below shows the dialog when using the content search feature. In this case, we are displaying the results of a YouTube search for “khan academy physics.”



Step 3. Once a user is happy with their selection they can click ‘Insert’ to insert their content into their document, and have it immediately appear in the editor.



Step 4. Once a user has completed editing their document, they can save their content to Connexions. Below is a preview of the Connexions module created from this walkthrough.


Future work

In the near future, we expect our plugin to be used to import, remix, and share physical science content from the Siyavula textbook. We also hope to convince resources like PhET and Minute Physics to adopt standardized APIs similar to YouTube's API.


Widespread adoption of a standard API would encourage broader access to these resources by significantly reducing development cost. It would also enable scenarios that are currently impractical (e.g. a universal search interface).

We have had a great time developing this feature, and we expect the media plugin's support to be expanded to include interactive demos, such as the physics simulations from websites such as PhET and Minute Physics. We also expect to expand our unified search support to include Vimeo and SlideShare. Thanks for reading, and feel free to check out our demo.

Monday, May 7, 2012

Connexions Importer

Since the beginning of January, I have been fortunate enough to work with Kathi Fletcher and an international team of developers on the Connexions Importer project. Funded by the Shuttleworth Foundation (http://www.shuttleworthfoundation.org/fellows/kathi-fletcher/), the project aims to smooth content sharing by creating tools that make it easier for content creators to contribute to open educational resources (OER). The importer accomplishes this by converting content from various formats to CNXML and then HTML, making it easy for contributors to share their content with anyone through Connexions (cnx.org).

My contribution to the importer has been mostly in bug fixes, which helped me become more familiar with the project's code base while also being an active contributor. I started with a couple of transformation bugs and then moved on to more complex bugs which required heavier fixes. They are listed below for your reading pleasure:

  1. CNXML Editor Server Error (Bug 61): Choosing the 'Edit CNXML' option in the importer's Advanced Mode caused an 'Internal Server Error'. I found the issue to be caused by a Unicode error where the CNXML was being interpreted as ASCII instead of Unicode. Adding a couple of lines of code to explicitly ensure a Unicode interpretation fixed this issue.
  2. MathML Rendering Error (Bug 26): HTML documents with extensive MathML failed to render after being imported. This error had a couple of causes. First, the MathML namespace necessary for the browser to correctly interpret MathML was not set in the HTML document. Second, the use of a modified Connexions-specific MathJax script, based on an older version of MathJax, also caused MathML interpretation errors. Explicitly adding the XML namespace and replacing the modified script with an updated MathJax script fixed the errors in Chrome, Firefox, and IE8. However, MathML still failed to load in IE9. I fixed this by adding a compatibility header which forced rendering in IE7 mode. More recently, the folks at MathJax have put out a new version that thankfully has removed the need to do this.
  3. Missing Images (Bug 137): Some embedded images in Google Docs fail to upload correctly through the importer. I investigated this issue with another intern and discovered that the failure mainly occurred due to a permissions issue. Images which are editable, such as PNGs, are located in a different part of Google which requires broader permissions. Although we haven't found a fix for this issue yet, we have been able to discover the source of the error, which is always half the battle.
  4. OpenOffice doc/docx to odt failures (Bug 123): My most significant contribution to date has been my fix for the OpenOffice bug. Simultaneous doc/docx uploads would periodically cause OpenOffice to hang, blocking any other doc/docx conversion request from completing (which is not entirely surprising, because OpenOffice was not built to handle multiple conversion requests). As an initial remedy, I modified the conversion pipeline to start OpenOffice as a background process that listens on a port, where it receives and handles conversion requests from clients. I then wrote an additional script which ensures that OpenOffice is listening when a conversion request is sent. However, this only handled the case in which OpenOffice crashes; it did not handle the case in which OpenOffice hangs after receiving simultaneous requests. While researching how developers have solved this particular issue, I found that they suggested using a Java-based tool called JODConverter (JOD). JOD handles simultaneous requests beautifully by creating a pool of OpenOffice processes which retrieve and handle requests from a global queue. It also provides additional features such as automatic restarts of OpenOffice after a crash, a task queue timeout, and a configurable maximum number of queued requests.
     However, because JOD is written in Java, I had to build some surrounding infrastructure to make it compatible with the importer's Python code base. Luckily, its creators provide a sample webapp that receives HTTP requests while running locally on a Tomcat server, which makes it usable from virtually any language. I spent the next couple of weeks setting up JOD and writing some Python code to construct and send HTTP requests. Then I ran some benchmarks to compare my solution to running OpenOffice as a daemon. I saw a 2.5x improvement in conversion times for simultaneous requests and, of course, no OpenOffice freezes related to simultaneous requests.
     Finally, I wrote an install script to allow easy customization of JOD's features and to ease the integration of my solution into the current pipeline. Currently, I am working with another developer to test and verify my solution before integrating it into the importer's pipeline. After doing so, I will provide a GitHub link to my Python solution, which I hope will be helpful to anyone looking for a Python JOD solution.
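
To give a feel for the Python side of item 4, here is a minimal sketch of constructing and sending an HTTP conversion request to a locally running converter webapp. It is written for Python 3 and is not the code from the solution described above. The endpoint URL, the convention of naming the input format in Content-Type and the desired output format in Accept, and the helper names are assumptions that depend on how the webapp is deployed.

```python
import urllib.request

# Hypothetical endpoint: the real path depends on the webapp deployment.
JOD_URL = "http://localhost:8080/converter/service"

# MIME types for the formats mentioned in this post.
MIME_TYPES = {
    "doc": "application/msword",
    "docx": "application/vnd.openxmlformats-officedocument"
            ".wordprocessingml.document",
    "odt": "application/vnd.oasis.opendocument.text",
}


def build_conversion_request(data, source_ext, target_ext="odt", url=JOD_URL):
    """Build (but do not send) a POST request for one conversion.

    Assumption: the server reads the input format from Content-Type
    and the requested output format from Accept.
    """
    return urllib.request.Request(
        url,
        data=data,
        headers={
            "Content-Type": MIME_TYPES[source_ext],
            "Accept": MIME_TYPES[target_ext],
        },
        method="POST",
    )


def convert(data, source_ext, target_ext="odt", timeout=60):
    """Send the document bytes and return the converted bytes."""
    request = build_conversion_request(data, source_ext, target_ext)
    # The timeout keeps a stuck conversion from blocking the caller forever.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With a converter listening locally, a call such as convert(open("paper.docx", "rb").read(), "docx") would return the odt bytes. Because each request is an independent HTTP call, the pooling and queueing stay on the converter's side, and the Python code does not need to manage OpenOffice processes itself.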
The last couple of months have been quite exciting for me, and I look forward to seeing what new challenges this summer will bring.

Thanks for reading
-Gbenga

Update:

Here is a Dropbox link to my solution:
https://www.dropbox.com/s/8nt7dngi4e29zi4/JOD.zip