Here is a nifty tip for anyone who builds user interfaces (web, desktop, or mobile). We seldom build a UI that has no text boxes. Text boxes, single-line or multi-line, let users enter details that are best captured as free text.
If you are a .NET developer, you are most probably using DataAnnotation attributes on the model properties that are bound to UI elements. For free text, we usually end up using a RegularExpression validator to validate user input automatically.
In this particular case, the regular expression is a whitelist of characters, and here is the catch. We build the regular expression from characters typed in the Visual Studio text editor, which leaves out certain valid look-alike characters, mainly because the editor does not insert them as you type. Microsoft Word is one such case: content typed there is not the same as similar-looking content typed in Notepad, so it can cause unwanted behavior in the system, such as legitimate input being rejected.
My colleague Amit helped me figure out the difference in character codes for characters that look identical on the surface.
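To make the catch concrete, here is a minimal sketch in Python (chosen for brevity; the idea carries over to a .NET pattern). The whitelist pattern, the `is_valid` helper, and the sample strings are all made up for illustration and are not taken from any real project.

```python
import re

# Hypothetical whitelist: letters, digits, spaces and basic punctuation
# as typed on a plain keyboard (ASCII hyphen, apostrophe, double quote).
WHITELIST = re.compile(r"[A-Za-z0-9 .,'\"-]*")


def is_valid(text: str) -> bool:
    """Return True when every character of text is in the whitelist."""
    return WHITELIST.fullmatch(text) is not None


# The same sentence, once typed in a plain editor and once as Word would
# typically store it (curly quotes and an en dash).
typed_in_editor = "It's a \"well-known\" issue"
pasted_from_word = "It\u2019s a \u201cwell\u2013known\u201d issue"

print(is_valid(typed_in_editor))   # True
print(is_valid(pasted_from_word))  # False
```

Both strings look almost the same on screen, yet only the first one passes the whitelist.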
Char Code Of – : 8211
Char Code Of ‘ : 8216
Char Code Of ’ : 8217
Char Code Of “ : 8220
Char Code Of ” : 8221
Char Code Of - : 45
Char Code Of ' : 39
Char Code Of ' : 39
Char Code Of " : 34
Char Code Of " : 34
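The codes above can be checked, and the Word variants mapped back to their plain counterparts, with a short sketch like the one below. It is written in Python for brevity. Normalizing input this way is only one possible approach, not something this post prescribes, and the names `WORD_TO_PLAIN` and `normalize` are hypothetical.

```python
# Word-style characters from the list above, mapped to their plain counterparts.
WORD_TO_PLAIN = {
    "\u2013": "-",   # en dash (8211)            -> hyphen-minus (45)
    "\u2018": "'",   # left single quote (8216)  -> apostrophe (39)
    "\u2019": "'",   # right single quote (8217) -> apostrophe (39)
    "\u201c": '"',   # left double quote (8220)  -> quotation mark (34)
    "\u201d": '"',   # right double quote (8221) -> quotation mark (34)
}

# Print each pair of character codes to confirm the values in the list.
for fancy, plain in WORD_TO_PLAIN.items():
    print(f"{ord(fancy)} -> {ord(plain)}")


def normalize(text: str) -> str:
    """Replace the Word variants with their plain keyboard equivalents."""
    return text.translate(str.maketrans(WORD_TO_PLAIN))
```

Running `normalize` on user input before validation would let an existing whitelist stay unchanged, at the cost of altering what the user typed.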
So the next time you create a regular expression, consider whether you need to support content typed in or copied from MS Word :)
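If the answer is yes, one option is to widen the whitelist itself. The sketch below is in Python and uses a made-up pattern; the `\uXXXX` escapes stand for the Word characters listed earlier, and a real whitelist would depend on what your field should accept.

```python
import re

# Hypothetical whitelist that also accepts the Word variants:
# \u2013 en dash, \u2018 and \u2019 single quotes, \u201c and \u201d double quotes.
WORD_FRIENDLY = re.compile(
    "[A-Za-z0-9 .,'\"\u2013\u2018\u2019\u201c\u201d-]*"
)


def accepts(text: str) -> bool:
    """Return True when every character of text is in the widened whitelist."""
    return WORD_FRIENDLY.fullmatch(text) is not None


print(accepts("It\u2019s a \u201cwell\u2013known\u201d issue"))  # True
print(accepts("It's a \"well-known\" issue"))                     # True
print(accepts("<script>"))                                        # False
```

Characters outside the list, such as angle brackets, are still rejected, so the whitelist keeps doing its job.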