Project: Japanese Audio Quiz - Part 2

In the first part of this project blog, I set out the goal of creating an audio-based webpage for learning the Japanese Hiragana alphabet. While there are several typing-based reinforcement quizzes online, the options for audio-based call-and-response quizzes are fairly limited. After doing some research, however, I saw that several of these writing-based quizzes have some great design ideas that could carry over well to an audio version.

Next I am going to talk about how I ended up building the functionality for this site. To help the project on its way, I am borrowing the audio assets from the learning Japanese site and using some of the HTML and CSS from the Lexilogos keyboard website, both of which are excellent resources. Here is a sneak peek at the direction the project ends up heading:

[Screenshots: full keyboard; random keys]


There were a number of steps that got me to this point. First, I had to flesh out the basic interface of the page from my starting template. Next, I needed to connect the audio and image assets I would be using. From there I started building basic functionality on top of the page using JavaScript, and finally took care of some more advanced design features, such as tweaking the user feedback aesthetics and improving the learning/reinforcement experience for the user.

Building the Foundation

I started by stripping out the elements I didn't need from the Lexilogos page source code, like the header, footer, and navigation elements specific to their site. As cool as their instantly translated text area was, I eventually removed it because it was taking up too much screen space. There were also a few extra rows of buttons with punctuation or accents I wouldn't need. Freeing up all that space allowed me to bump up the size of the buttons so they would be easier to read and press.

The biggest thing I got out of this was the keyboard structure, with all of the Hiragana Unicode characters already inserted and organized. This saved a lot of time compared to building it all up from scratch and hunting down the Unicode characters on my own. At one point I had considered trying to find image files for all of the Hiragana kana, but this solution was much more elegant.

The Lexilogos keyboard also includes the phonetic English letters underneath each kana button, but the point of creating an audio quiz is to let the student get away from the dependence on English characters when learning a new alphabet. Rather than remove them outright, however, I decided to create an option where users could hide or show the English letters. This way you could use the webpage even if you were seeing the kana for the first time, then gradually back off using the English phonetics as you got more familiar with them.

My original naive approach was to use a JavaScript function to go through and set the hidden attribute for those rows using either a class or id selector; in fact, several initial Google search results had me headed down this road for a while. As it turns out, however, there was a much cleaner jQuery method for doing this that could attach directly to the button's onclick event, by simply calling $('.Keyboard').toggle()

Our last major structural issue involved implementing one of the RealKana design features, which allows users to customize which sets of kana they want to be tested on. As mentioned back in part 1, each set of five kana is basically a consonant combined with the five vowels. For example, the second row of kana represents the sounds 'Ka', 'Ki', 'Ku', 'Ke', and 'Ko'. So in the keyboard table, I added an extra row of check boxes, which would later be used in the JavaScript logic to build a valid set of kana audio files to pull from.


Adding randomly selected sound files to the page was an iterative process. My first experiments grabbed a random audio file out of the folder and then inserted it into a previously empty HTML element on the page. Grabbing a new random audio file would then overwrite the HTML in that element each time.

I ended up using the HTML5 style of audio embedding since I found it easier to work with for now, and since I'm not currently focusing on cross-browser compatibility. The default HTML5 embedded controls made it relatively easy to replay the short audio files, and also happened to look good on the page.

Eventually, I used the check boxes from before to build an array of English phonetic kana to draw from. So instead of drawing a random audio file name out of a folder, I could dynamically build that file location using a naming convention based on those letters. This same naming convention would then be used later on to check if the user selected the correct kana button for the audio file, and to build a random list of kana including the one in the currently loaded audio file.
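As a rough sketch of that idea: the post doesn't spell out the actual naming convention, so the row table, the `audio/` folder, and the `.mp3` extension below are all assumptions made for illustration, as are the function names.

```javascript
// Hypothetical table: each check box row maps to the romanized kana in that row.
const KANA_ROWS = new Map([
  ["vowels", ["a", "i", "u", "e", "o"]],
  ["k", ["ka", "ki", "ku", "ke", "ko"]],
  ["n", ["na", "ni", "nu", "ne", "no"]],
]);

// Flatten the checked rows into one array of valid kana to draw from.
// Unknown row names are ignored.
function buildValidKana(checkedRows) {
  return checkedRows.flatMap((row) => KANA_ROWS.get(row) || []);
}

// Build the audio file location from the kana's romanized name
// (assumed convention: audio/<romaji>.mp3).
function audioPathFor(romaji) {
  return `audio/${romaji}.mp3`;
}

// Pick one entry at random; `random` is injectable so the choice can be tested.
function pickRandom(items, random = Math.random) {
  return items[Math.floor(random() * items.length)];
}

const validKana = buildValidKana(["vowels", "k"]);
console.log(audioPathFor(pickRandom(validKana)));
```

Because the same romanized string drives both the file name and the button value, checking an answer later only needs a string comparison.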

User Feedback

Now that we can summon forth an audio file and match it with a specific button push, we want to tell the user whether or not they selected the correct answer. In the case that they get it right, we would then want to give them a new audio file to solve.

For checking the user's answer, I attached a new JavaScript function to all of the button onclick events. The function compares the value of the button they pressed (passed in as an argument) to a value that is embedded in the audio file HTML each time a new audio file is called. At this point, we need to give the user some type of feedback based on their selection.
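A minimal sketch of that comparison, assuming the expected answer is stored in a `data-answer` attribute on the audio element. The attribute name, file layout, and function names here are hypothetical, not necessarily what the page uses.

```javascript
// Build the audio element markup, embedding the expected answer in a
// data attribute (attribute name and file layout are assumptions).
function buildAudioHtml(romaji) {
  return `<audio controls data-answer="${romaji}" src="audio/${romaji}.mp3"></audio>`;
}

// Compare the pressed button's value with the expected answer.
function isCorrect(pressedValue, expectedValue) {
  return pressedValue === expectedValue;
}

console.log(buildAudioHtml("ka"));
console.log(isCorrect("ka", "ka")); // same string, so this is a match
```

In the browser, the expected value could then be read back with something like `document.querySelector('audio').dataset.answer` before calling `isCorrect`.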

A quick Google image search gave me some good resources for correct/wrong answer images, in this case a thumbs up and a thumbs down. Rather than constantly overwriting the HTML element as I had been doing with the audio files (which admittedly may not be the best approach at scale, but works well enough for a personal prototype), I included them as hidden images embedded and centered just under the audio file element.

Each positive or negative response should appear for a short amount of time before disappearing (and, in the case of correct answers, selecting a new audio file), which I initially implemented with a default timeout value. I also decided to get a bit fancy and add a short fade-in and fade-out effect to the images to make things look a lot smoother.

There were a number of potential ways to schedule these fade effects, including using more timeout functions, though that might start to get a bit ugly. I was particularly thrilled to find yet another elegant jQuery solution, however: it was possible to chain the effect calls in my JavaScript function. The resulting line of code was simplified to something along the lines of $('#Correct').fadeIn().delay(1000).fadeOut();

However, I anticipated there might be issues with this sort of implementation, so I made sure to test several cases before moving on. This revealed several problems, including:

  1. The user could end up with both the Correct and Incorrect button up at the same time...
  2. The user could end up 'queueing' several fade in effects, that would appear one after another...
  3. The user could also end up queueing multiple sound file requests...

I addressed these issues in two steps. First, in either the Correct or Incorrect answer case, I had to stop any existing animations on both image files and hide them immediately. This would prevent both images from showing up at the same time, and stop multiple fade effects from being queued.

The second step involved changing how I had implemented the timeout before requesting a new audio file. Instead of inserting a delay followed by the function call, I scheduled the function call itself with a timeout of the same duration. I then made sure to clear any pending timeout for that function before scheduling it again. So if a user spams the correct answer over and over, the page should now only request a new audio file once.
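The second step can be sketched roughly as follows. The helper name `makeSingleShot` and the injectable `timers` object are my own illustration rather than the page's actual code. For the first step, jQuery's `.stop(true, true)` followed by `.hide()` on both images is one way to clear queued effects, though the exact calls used on the page are not shown in this post.

```javascript
// Wrap a callback so that repeated scheduling within the delay window
// results in only one call. `timers` is injectable for testing.
function makeSingleShot(
  callback,
  delayMs,
  timers = {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: (id) => clearTimeout(id),
  }
) {
  let pending = null;
  return function schedule() {
    // Cancel any request that is still waiting before scheduling a new one.
    if (pending !== null) {
      timers.clear(pending);
    }
    pending = timers.set(() => {
      pending = null;
      callback();
    }, delayMs);
  };
}

// Hypothetical usage: three rapid clicks on the correct button still
// request only one new audio file, one second after the last click.
const requestNewAudioSoon = makeSingleShot(() => {
  console.log("requesting a new audio file");
}, 1000);
requestNewAudioSoon();
requestNewAudioSoon();
requestNewAudioSoon();
```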

Inserting some randomness

While the Hiragana alphabet has a nice structure for learning through sets of characters, this also creates a design problem for the learning process: the learner may substitute positional memorization for symbol association.

Because the kana are always laid out the same way, the third character in a row will always be the 'U' sound for that consonant. So given the audio cue "Tu", if you know which set of characters are the "T" characters, then you know to choose the third one in that set even if you don't remember what the kana looks like. If the student is later presented with the same kana outside the context of the alphabet structure, they could suddenly find themselves unable to recognize it.

This is why, when memorizing the Hiragana alphabet, one of the tricks I used was to try writing the letters in a different order (reverse order, 'u' letters first, etc.). For my online quiz, this meant I wanted an option to replace the full Hiragana keyboard with a selection of kana chosen randomly and presented outside the context of the full alphabet.

Ideally this creates a natural progression as learners can first use the full keyboard with English while memorizing the characters; toggle to the full keyboard without English labels to reinforce it; and then switch over to the random button mode to drill what they have learned.

As with my audio file iteration, my first attempts involved drawing random characters from the entire Hiragana alphabet, then using them to build the HTML for a number of buttons, attaching the proper function and values based on their character, and setting that HTML string as the innerHTML of a designated element. Later iterations connected this function to the request audio file function, so I could pass in the list of currently selected kana sets as well as the value of the currently loaded audio file, to make sure the correct response was always in the list of buttons.

During this process, I made sure that it would re-draw random kana to avoid duplicates in the button list. There is a limit on how many times it tries to do this to prevent hanging, in case of edge cases where more buttons are being generated than there are kana in the current set.

I also added the ability for the page to dynamically increase the number of random buttons it creates, depending on the size of the currently valid kana set. Specifically, by default the JavaScript will attempt to create four buttons, plus an additional button for every twelve characters in the valid kana set passed in. This means that if you are testing three sets of kana at once, you will have to choose from among five buttons instead of four, all the way up to nine buttons if you're testing yourself on the entire alphabet.
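The button-count rule and the capped re-draw loop could look something like the sketch below. The function names, the attempt limit of 100, and the final shuffle are assumptions of mine; the post only describes the behavior, not the code.

```javascript
// Four buttons by default, plus one more for every twelve valid kana.
function buttonCount(validKana, base = 4, perExtra = 12) {
  return base + Math.floor(validKana.length / perExtra);
}

// Choose the kana to show as buttons, always including the correct answer.
// Duplicates are re-drawn, with an attempt cap so a tiny set cannot hang the page.
function pickButtons(validKana, answer, maxAttempts = 100, random = Math.random) {
  const target = buttonCount(validKana);
  const chosen = [answer];
  let attempts = 0;
  while (chosen.length < target && attempts < maxAttempts) {
    attempts += 1;
    const candidate = validKana[Math.floor(random() * validKana.length)];
    if (candidate !== undefined && !chosen.includes(candidate)) {
      chosen.push(candidate);
    }
  }
  // Fisher-Yates shuffle so the answer is not always the first button.
  for (let i = chosen.length - 1; i > 0; i -= 1) {
    const j = Math.floor(random() * (i + 1));
    [chosen[i], chosen[j]] = [chosen[j], chosen[i]];
  }
  return chosen;
}

console.log(pickButtons(["ka", "ki", "ku", "ke", "ko"], "ku"));
```

With a single set of five kana the target is four buttons; with three sets (fifteen kana) it becomes five, matching the numbers above.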

Dynamically creating the HTML for the random buttons meant I had to include the Unicode character for the Hiragana symbol of each kana as well. To accomplish this, I again took advantage of the Lexilogos source code by modifying the script they use to do the character replacement in their text box area. It's not that this is an overly complex thing to do; it simply involves a huge case statement to deal with each of the 60-odd character conversions. Being able to leverage their code, which did almost exactly this, saved a lot of tedium.
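For illustration, here is a cut-down version of that conversion using a lookup table instead of a case statement. This is not the Lexilogos code; only the first two rows are included, and the `checkAnswer` handler name in the generated markup is hypothetical.

```javascript
// Partial romaji-to-hiragana table (first two rows only, for brevity).
const HIRAGANA = new Map([
  ["a", "あ"], ["i", "い"], ["u", "う"], ["e", "え"], ["o", "お"],
  ["ka", "か"], ["ki", "き"], ["ku", "く"], ["ke", "け"], ["ko", "こ"],
]);

// Build the HTML for one random-mode button. The button's value carries the
// romanized name, while the visible label is the Hiragana character.
function buttonHtml(romaji) {
  const kana = HIRAGANA.get(romaji);
  if (kana === undefined) {
    throw new Error(`No hiragana known for "${romaji}"`);
  }
  return `<button value="${romaji}" onclick="checkAnswer(this.value)">${kana}</button>`;
}

console.log(["ka", "o"].map(buttonHtml).join(""));
```

A table like this is just as tedious to type out as the case statement, which is why borrowing an existing one saved so much time.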

Forming a Queue

While testing the prototype, one of the quirks I noticed when quizzing a single set was how unevenly certain kana appeared. It was possible to get the same audio file several times in a row, but more importantly it was possible to go a long time without hearing a specific kana from the set. I didn't mind seeing some of the kana twice before seeing everything in the set at least once, but when testing small sets I wanted to limit how often this happens.

The sophisticated solution would probably be to create a weighted value attached to the likelihood of selecting each character in the set. You could do this with a double array, although then it would raise some questions as to what to do with weights if you started selecting and deselecting new sets while testing. Or you could enter multiple copies / remove copies of the kana from the array; but if left unbounded the array could get quite huge if you kept at the page for a long time.


Instead, I opted for a simpler approach that aimed to prevent seeing the same characters too often while testing smaller sets. To accomplish this, rather than selecting a single kana for each audio request, I filled a short array with four or five kana to act as a queue. So the first time, and every subsequent time, the request audio function is called, it makes sure to fill the queue up to a designated size by calling a fillQueue function. The default queue size is four, though it would be easy to change or even dynamically resize it like we did for the random buttons.

The first thing the queue function does is check the existing queue to see if any of the characters in the array are no longer in the list of currently valid/tested kana, and remove them if so. Then it adds new random characters from the currently valid list to the end of the array until it reaches the correct length. After the queue is returned, the request audio function removes the first element of the array using shift(), reducing the size of the array by 1, and uses that value to build the audio file name as before.
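A sketch of how such a queue might work is below. Only the `fillQueue` name comes from the description above; `nextKana`, the parameters, and the attempt cap are assumptions. I am also assuming the queue skips kana that are already waiting in it, which the post implies but does not state outright.

```javascript
// Drop stale entries, then top the queue up to `size` with random valid kana.
function fillQueue(queue, validKana, size = 4, random = Math.random, maxAttempts = 100) {
  // Remove kana whose set has been unchecked since the queue was built.
  const kept = queue.filter((kana) => validKana.includes(kana));
  let attempts = 0;
  while (kept.length < size && attempts < maxAttempts) {
    attempts += 1;
    const candidate = validKana[Math.floor(random() * validKana.length)];
    // Assumption: avoid repeating a kana that is already in the queue.
    if (candidate !== undefined && !kept.includes(candidate)) {
      kept.push(candidate);
    }
  }
  return kept;
}

// Refill the queue, then take the first element off the front with shift().
function nextKana(state, validKana, size = 4, random = Math.random) {
  state.queue = fillQueue(state.queue, validKana, size, random);
  return state.queue.shift();
}

const state = { queue: [] };
const current = nextKana(state, ["ka", "ki", "ku", "ke", "ko"]);
console.log(current, state.queue);
```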

Smarter Learning

The queue allowed me to implement another 'advanced' feature: smart reviewing. The Anki flash card system mentioned in the previous post has an incredibly robust review function. If you get a prompt wrong, it will present you with that same prompt again shortly. Questions you get wrong will show up more frequently in your schedule, while questions you get right can be told to show up less frequently, depending on how hard a time you had answering them and how many times you had previously answered them correctly. The Anki system, however, is entirely based on self-assessment.

Currently my web page doesn't take advantage of any cookies or HTML5 web storage. While I am sure there is some interesting research on algorithms or scheduling approaches to maximize the effects of retention and repetition, my implementation choices so far are a bit limited. I decided instead to go with a solution I have seen used by another 'flash card / drilling' website: Free Rice.

Free Rice is a vocabulary website which started off focusing on SAT words and second language vocabulary, though later they added other subjects such as the periodic table or famous paintings. Free Rice incentivizes students to study by showing that each answer they get correct will cause its sponsors to contribute 10 grains of rice to combat world hunger. As learners continue to get answers correct, they start seeing the amount of rice they are helping to donate stack up. The site is responsible for donating several billion grains of rice a year.

The simple approach they take when a student gets a question wrong is to revisit the question exactly three questions later. It isn't terribly fancy, but it is effective and easy to implement. So now when a student selects the wrong answer for a prompt on my site, the function that activates the fade-in for the 'thumbs down' image also replaces the third element in the current queue with the kana they just got wrong. Since we just took this kana off the front of the queue, inserting it back in shouldn't cause any duplicates.
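In code, the replacement step might look like this small sketch. The function name and the behavior for queues shorter than three entries are assumptions on my part.

```javascript
// Put a missed kana back into the third slot (index 2) of the queue.
// If the queue is shorter than that, the kana is appended instead.
// Returns a new array and leaves the original queue untouched.
function requeueMissed(queue, missedKana, position = 2) {
  const updated = queue.slice();
  updated[Math.min(position, updated.length)] = missedKana;
  return updated;
}

// The prompt "ka" was already shifted off the front, so the queue holds three
// entries; after a wrong answer the third one is replaced with "ka".
console.log(requeueMissed(["ki", "ku", "ke"], "ka")); // → ["ki", "ku", "ka"]
```

One side effect of replacing rather than inserting is that several wrong guesses on the same prompt just overwrite the same slot, so the queue never grows.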

The only difference between my method and Free Rice (aside from the fact that I'm not currently combating world hunger) is that my prompts won't continue on to the next one until the student selects the right response. The Free Rice method moves on to the next question after an incorrect response, so students don't get a chance to try again right away. It would be interesting to see which of those approaches would be more effective for retention.

Future Wishlist

At this point, my Learn Hiragana website is pretty robust and fully functional for personal use. But that doesn't mean the project is necessarily complete. Aside from efficiency and cross browser compatibility changes, there are a few other ideas I've had for advanced features for learning Japanese...

The next step would be to include a list of audio files for Japanese words. The idea would be that users would hear a word, and then have to select the letters that make up that word in order.

This isn't meant to act as vocabulary drilling, though it certainly can be an introduction and eventually segue into that. The idea is that it is basically ear training so students can get used to hearing what the letters sound like in context. Similar to the way that the random button mode separates the kana association from the physical location in the Alphabet structure, a word prompt would start putting the sounds into the context in which learners are likely to hear them in real life.

The basic premise would be that I would collect a large list of words, then build word queues out of all of the words that contain only letters in the currently checked sets. That, of course, would probably involve a whole lot of buckets or grep commands to build a word list like that. It would also require quite a few audio files...
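As a first pass, the filtering itself could be done with a set lookup rather than buckets or grep. The word list format below (each word stored alongside its kana broken into romanized pieces) is purely hypothetical, since none of this has been built yet.

```javascript
// Hypothetical word list: each entry stores the word and its kana as romaji.
const WORDS = [
  { word: "あか", kana: ["a", "ka"] }, // aka, "red"
  { word: "いけ", kana: ["i", "ke"] }, // ike, "pond"
  { word: "ねこ", kana: ["ne", "ko"] }, // neko, "cat"
];

// Keep only the words whose every kana is in the currently checked sets.
function wordsForSets(words, validKana) {
  const allowed = new Set(validKana);
  return words.filter((entry) => entry.kana.every((k) => allowed.has(k)));
}

const vowelsAndK = ["a", "i", "u", "e", "o", "ka", "ki", "ku", "ke", "ko"];
console.log(wordsForSets(WORDS, vowelsAndK).map((entry) => entry.word));
```

With only the vowel and 'K' sets checked, "ねこ" is filtered out because "ne" is not in the valid list yet.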

Copy Right

My largest concern right now is whether or not I should post this work, which I've done for my personal use, on my site, given the amount of resources I have borrowed from other sites. I make a big deal about trying to properly credit my sources, and have tried to be up front about that here, as well as adding an attribution section to the bottom of my page even though it hasn't been publicly released and I have no intent to monetize the work.

If it was just the code I borrowed from Lexilogos to alleviate some of the tedious work of building the massive tables/case statements, then I wouldn't feel too bad about putting it up and sending them an email to make sure they don't have a problem with it. I've built a lot of the functionality to make it do what I wanted from scratch, and creating those tables on my own would probably have looked almost exactly the same.

But the problem is the large amount of audio assets I ended up borrowing from the learning Japanese site and the Anki Hiragana deck. I would like to include more of my projects on the site, but I really don't want to step on copyright issues here, and I want to make sure that what I do copy, I do right by the people whose work I have appreciated and been inspired by.


Still, at least I've managed to point some people towards some excellent resources on learning languages, and given some insight into the design considerations around building out a project like this. If I ever do end up posting the webpage itself, I hope people will find it useful for their own learning purposes as well.




One thought on “Project: Japanese Audio Quiz - Part 2”

  • This sounds really cool! I want to use it right now! I especially like your ideas on making a Q for upcoming tests and using that to repeat failed tests more often. I also think that hearing sounds within a word is definitely different than hearing them alone, so I think the word bank (especially if filtered for only sounds you are currently studying) is a really strong idea. I hope that one day you can sort out copyright stuff so some of us can play with it!
