Just before we could finally say goodbye to ugly 2016, I visited the 33rd Chaos Communication Congress in Hamburg. One evening, I chatted with a physicist. He asked me whether it was possible to use something like fuzzy answers with clozes in H5P. It wasn’t. But indeed, this might come in handy. Thanks for sharing the idea! Such a feature would be pretty neat for forgiving typos. It could also be useful for accepting different spellings of names without explicitly listing all possible alternatives.
Well, what do you think I did after I returned home? 😉 As always, I cannot promise a date for a release. The guys over at Joubel are busy working on other cool new stuff for H5P and will first have to find some time to check my contribution. Anyway, if you’re impatient (“Patience is for wimps!”), you can get the code from GitHub. Please note that you will also need the new library H5P.TextUtilities, which I created to factor out some functions that may also be relevant for other interactivities.
The Levenshtein distance counts the number of operations necessary to transform one string into the other. The operations are deleting, inserting, or substituting a character. Damerau added swapping two adjacent characters to the pool of operations. Consequently, the fewer operations you need for a transformation, the more similar the strings are.
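To make the counting of operations a bit more tangible, here is a minimal sketch in Python. It is not the code from H5P.TextUtilities, and the names `osa_distance`, `is_close_enough`, and `max_operations` are made up for this illustration. It also implements the restricted variant known as optimal string alignment, which handles a swap of two adjacent characters but never edits the same substring twice, so it can differ from the unrestricted Damerau-Levenshtein distance in rare cases.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = operations needed to turn a[:i] into b[:j]
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Damerau's addition: two adjacent characters swapped
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def is_close_enough(answer: str, solution: str, max_operations: int = 1) -> bool:
    """Hypothetical helper: accept an answer within max_operations edits."""
    return osa_distance(answer.lower(), solution.lower()) <= max_operations
```

With this sketch, `is_close_enough("Hamubrg", "Hamburg")` accepts the answer, because a single swap of two adjacent characters repairs the typo. Plain Levenshtein would count two substitutions for the same mistake.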
The Jaro distance, on the other hand, “simply” represents the similarity as a percentage. Winkler refined the algorithm a little bit for particular cases. I also found a paper from 2012 that suggests a more general improvement. Well, I need something to explore in the future, right? 😉
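For the curious, here is a rough Python sketch of the textbook Jaro similarity and Winkler's refinement. Again, this is not the code from H5P.TextUtilities, and the function names are my own for this illustration. As far as I understand it, the "particular cases" Winkler cares about are strings that share a common prefix: the sketch assumes the commonly used defaults of at most four prefix characters and a scaling factor of 0.1 (`prefix_scale` here). The result is a value between 0 and 1, which you can read as a percentage.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between 0.0 (nothing in common) and 1.0 (identical)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters only count as matching within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Walk through the matched characters of both strings in order and
    # count the positions where they disagree; half of that number is
    # the number of transpositions.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity with a bonus for a shared prefix (max. 4 characters)."""
    similarity = jaro(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1[:4], s2[:4]):
        if c1 != c2:
            break
        prefix += 1
    return similarity + prefix * prefix_scale * (1 - similarity)
```

The classic example pair "MARTHA" and "MARHTA" gets a Jaro similarity of about 0.944, and the shared prefix "MAR" lifts the Jaro-Winkler value to about 0.961. An answer would then be accepted if that value is at or above whatever minimum threshold is configured.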
The users of H5P can, of course, use the options to tweak both algorithms a little bit. It’s hard to say in advance what the maximum number of operations for Damerau-Levenshtein or the minimum threshold for Jaro-Winkler should be. Also, the options might change. I just noticed that a discussion has started about whether an on/off switch might be more user-friendly. Gotta go!