Home
last modified November 2, 2008 by russf
Overview 
Plone and Zope find diverse applications, many of them related to document management and presentation.
There are significant opportunities to use existing, well-understood machine learning and text classification techniques to categorize and group documents, and to improve the quality of document metadata within Plone.
Goals
Add a simple framework for defining classifiers that help generate metadata and group content. The underpinnings should eventually be useful for Zope3 as well as Plone. We would like our first customer to be the Plone Help Center on Plone.org, where the classifier would help select optimal keywords, sections, etc.
Planning
The current plan is a result of the classifier sprint at the Arlington Career Centre, near Washington DC, in October 2008:
- complete the recipe for building SVM on OS X etc. (may depend on some changes to the SWIG invocation)
- complete the clarity.classifier package
  - improve the preprocessor (lower case, remove singletons, remove high-frequency words) (committed revision 74887)
  - add tests (committed revision 74887)
- improve classifier parameters for simple tests, and provide a corpus for learning.
- add a Plone integration package that uses the classifier to recommend keywords and other metadata based on existing content, with the goal of tagging articles in the Plone Help Center more consistently.
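The preprocessing steps named in the plan (lower-casing, removing singletons, removing high-frequency words) can be illustrated with a minimal sketch. This is not the code in the clarity.classifier package: the function name `preprocess`, the `max_doc_fraction` threshold, the tokenizing regular expression, and the reading of "singleton" as a term that occurs only once in the whole corpus are all assumptions made for illustration.

```python
import re
from collections import Counter


def preprocess(documents, max_doc_fraction=0.5):
    """Return one filtered token list per input document.

    Hypothetical helper: the name, the threshold, and the tokenizer
    are illustrative assumptions, not the package's actual API.
    """
    # Step 1: lower-case each document and split it into word tokens.
    tokenized = [re.findall(r"[a-z0-9]+", doc.lower()) for doc in documents]

    # Count total occurrences of each term across the corpus, and the
    # number of documents each term appears in.
    term_counts = Counter(tok for doc in tokenized for tok in doc)
    doc_counts = Counter(tok for doc in tokenized for tok in set(doc))
    n_docs = len(tokenized)

    def keep(tok):
        # Step 2: drop singletons (assumed here to mean terms that
        # occur only once in the whole corpus).
        if term_counts[tok] < 2:
            return False
        # Step 3: drop high-frequency words, i.e. terms present in more
        # than max_doc_fraction of all documents.
        if doc_counts[tok] / n_docs > max_doc_fraction:
            return False
        return True

    return [[tok for tok in doc if keep(tok)] for doc in tokenized]


if __name__ == "__main__":
    docs = [
        "The Plone catalog indexes content",
        "The Zope catalog stores objects",
        "The buildout recipe installs Plone",
        "The recipe is simple",
    ]
    for tokens in preprocess(docs):
        print(tokens)
```

The filtered token lists would then be turned into feature vectors for the classifier. The 0.5 cutoff is arbitrary and would need tuning against a real corpus such as the Plone Help Center content.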
Links
SVM - the technology for the first classifier implementation
Developers
Getting started:
    svn co https://svn.plone.org/svn/collective/clarity.classify
    cd clarity.classify
    python bootstrap.py
    bin/buildout
    bin/instance test -s clarity.classify