• Xinha discussion

  • DOMwalk and TransformInnerHTML

    from nicholasbs on Dec 04, 2008 06:42 PM
    Hi all,
    
    There are two getHTML implementations: DOMwalk, which, as its name suggests, walks the DOM tree, and TransformInnerHTML, which uses regexes for its cleanup. According to the comments, DOMwalk has been around longer, while TransformInnerHTML is a newer implementation designed to be faster. Is this right, and does anyone know more about their history?
    
    Why do we maintain both versions? Would it make sense to choose one and concentrate on improving it, or are there strong reasons for supporting both? Each currently has bugs that don't exist in the other (e.g., #1245, #1250).
    
    There is also the GetHtml plugin, which simply sets xinha_config.getHtmlMethod to 'TransformInnerHTML' when it's turned on (all of the actual work is done in modules/GetHtml). Since the same setting can be made directly in a Xinha config file, I think this plugin should be reclassified as 'unsupported'. (Are there any good reasons not to do this? I assume the plugin exists in its current state purely for legacy reasons, correct?)
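    
    For concreteness, here is a rough sketch of the config-file equivalent I have in mind. The plain object below is only a stand-in so the snippet runs on its own (in a real page xinha_config would be the editor's config object), and I'm assuming that 'DOMwalk' is the default and that the value is spelled 'TransformInnerHTML', matching the implementation's name:
    
    ```javascript
    // Stand-in for the real config object (assumption: a real page would get
    // this from Xinha itself; a plain object is used here so the sketch runs alone).
    var xinha_config = { getHtmlMethod: 'DOMwalk' };
    
    // Choose the regex-based implementation instead of the DOM-walking one.
    // This one assignment is all the GetHtml plugin appears to do.
    xinha_config.getHtmlMethod = 'TransformInnerHTML';
    ```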
    
    Thanks,
      -Nick