Design Pattern: Chain of Command or Chain of Responsibility
September 21, 2006. Posted by shahan in software architecture.
When programming, certain sections of code can sometimes be viewed as a workflow: a preset sequence of tasks or commands. This can be considered an instance of the design pattern called Chain of Command or, alternatively, Chain of Responsibility. The pattern implies a container, the Context, that holds the objects processed by the workflow; each Command in the Chain can add objects to it or remove them.
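To make the Context, Command and Chain roles concrete, here is a minimal sketch in plain Java. All of the names (ChainSketch, Context, Command, Chain, run) are hypothetical and chosen only for illustration; they are not taken from any particular library. It assumes the convention that a Command returns true to stop the chain.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

/** Illustrative only: these types are hypothetical, not a library API. */
class ChainSketch {

    /** The Context: a shared container that every Command may read, add to, or remove from. */
    static class Context extends HashMap<String, Object> { }

    /** One step of the workflow. Returning true means "processing is complete, stop the chain". */
    interface Command {
        boolean execute(Context context);
    }

    /** Runs its Commands in the order they were added until one of them returns true. */
    static class Chain implements Command {
        private final List<Command> commands = new ArrayList<>();

        Chain add(Command command) {
            commands.add(command);
            return this;
        }

        @Override
        public boolean execute(Context context) {
            for (Command command : commands) {
                if (command.execute(context)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Builds a small three-step workflow and runs it on the given text. */
    static Context run(String text) {
        Context context = new Context();
        context.put("raw", text);

        Chain chain = new Chain()
            // Step 1: add a new object to the Context.
            .add(ctx -> { ctx.put("trimmed", ((String) ctx.get("raw")).trim()); return false; })
            // Step 2: remove an object that later steps no longer need.
            .add(ctx -> { ctx.remove("raw"); return false; })
            // Step 3: add a result and signal that the chain is finished.
            .add(ctx -> { ctx.put("length", ((String) ctx.get("trimmed")).length()); return true; });

        chain.execute(context);
        return context;
    }

    public static void main(String[] args) {
        System.out.println(run("  hello chain  "));
    }
}
```

Each step only knows about the Context, so steps can be reordered or replaced without touching the others.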
There are various ways to define and implement these workflows, the easiest being to write the code as is (which isn’t much of a model). This makes the workflow rigid: any change requires source code changes and recompilation.
One alternative is to have a Java dynamic class loader load the class that contains the workflow, after some decision-making process. With a small amount of creativity, this technique allows new workflows to be created without modifying and recompiling the original source code; only the workflows themselves are compiled. This may introduce undesirable runtime errors, since a missing or incompatible class is only detected when it is loaded.
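The dynamic-loading idea can be sketched with the standard reflection calls Class.forName and getDeclaredConstructor().newInstance(). The Workflow interface, the UppercaseWorkflow class and the load method below are hypothetical names made up for this illustration; in a real setup the class name would come from configuration and the workflow class would be compiled separately.

```java
import java.util.Locale;

/** Illustrative only: Workflow, UppercaseWorkflow and load are hypothetical names. */
class WorkflowLoaderSketch {

    /** Contract that every pluggable workflow is assumed to implement. */
    interface Workflow {
        String process(String input);
    }

    /** Stand-in workflow; in practice it would live in its own separately compiled class file. */
    static class UppercaseWorkflow implements Workflow {
        @Override
        public String process(String input) {
            return input.toUpperCase(Locale.ROOT);
        }
    }

    /**
     * Loads a workflow by class name. A misspelled name or a class that does not
     * implement Workflow is not caught by the compiler; it only fails here, at run time.
     */
    static Workflow load(String className) {
        try {
            Class<?> type = Class.forName(className);
            return (Workflow) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load workflow: " + className, e);
        }
    }

    public static void main(String[] args) {
        // The name is obtained from the class itself only to keep the demo self-contained.
        String configuredName = UppercaseWorkflow.class.getName();
        System.out.println(load(configuredName).process("hello"));
    }
}
```

The catch block is where the runtime errors mentioned above would surface.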
Another viable alternative is to use the Apache Chain library. Its performance is very reasonable, and its implementation and XML-file configuration make it a treat to use. As a rough performance test, an empty Command (configured through the XML configuration file) was called for each word produced by a tokenizer over a set of 750 files, which added only approximately 500 ms of processing time (not counting startup costs or memory footprint; further results may follow).
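The shape of that measurement can be sketched in plain Java: call a do-nothing command once per token and time the loop. This is not the original test and does not use the library; the Command interface and runPerWord method are hypothetical stand-ins, and the sample text is made up.

```java
import java.util.StringTokenizer;

/** Illustrative only: a stand-in for timing an empty command called once per word. */
class EmptyCommandTiming {

    /** Hypothetical stand-in for a chain Command; the library's own interface is not used here. */
    interface Command {
        void execute(String word);
    }

    /** Calls the command once per whitespace-separated word and returns the number of calls. */
    static long runPerWord(String text, Command command) {
        StringTokenizer tokenizer = new StringTokenizer(text);
        long calls = 0;
        while (tokenizer.hasMoreTokens()) {
            command.execute(tokenizer.nextToken());
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        String text = "the quick brown fox jumps over the lazy dog";
        Command empty = word -> { };  // does nothing, so only the call overhead is measured

        long start = System.nanoTime();
        long calls = runPerWord(text, empty);
        long elapsedNanos = System.nanoTime() - start;

        System.out.println(calls + " calls in " + elapsedNanos + " ns");
    }
}
```

A single short run like this is dominated by JVM warm-up, so any real comparison would need many more tokens and repeated runs.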
Introduction to Information Retrieval (IR)
September 21, 2006. Posted by shahan in information retrieval.
It all begins with parsing a document, tokenizing the words that do not appear on a blacklist (common words such as ‘the’, ‘it’), and attaching a rank to these words based on an algorithm.
There are several algorithms available, the gist of which is: if a word appears many times in a document, then it’s important. Algorithms will be covered later.
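As a minimal sketch of the steps above, the following Java tokenizes a document, skips words on a small blacklist, and uses the raw occurrence count as the rank. The class name, the tiny stop list and the choice of plain counting are assumptions for illustration; the actual ranking algorithms are a separate topic.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Illustrative only: ranks words by how often they occur in one document. */
class TermFrequencySketch {

    /** Tiny example blacklist; a real one would be much longer. */
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("the", "it", "a", "is", "of", "and"));

    /** Returns each non-blacklisted word with its occurrence count, sorted alphabetically. */
    static Map<String, Integer> termFrequencies(String document) {
        Map<String, Integer> counts = new TreeMap<>();
        // Lowercase, then split on anything that is not a letter or digit.
        for (String token : document.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {cat=2, purred=1, ran=1, sat=1}
        System.out.println(termFrequencies("The cat sat. The cat ran, and it purred."));
    }
}
```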
Once this information is obtained (a list of words and their importance), the words need to be linked to the documents in which they appear. For this, inverted indexes [Wikipedia] come in handy, though the representation shown on Wikipedia is only one of a few possible ones. Different representations will be covered later, along with enhancements that provide services such as highlighting or taking advantage of document structure. Of course, there are tradeoffs between performance and disk space, which makes it all the more interesting.
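One simple in-memory representation of an inverted index maps each word to the sorted set of document ids in which it appears. The sketch below assumes integer document ids and hypothetical names (InvertedIndexSketch, add, lookup); it stores no ranks or word positions, which the enhancements mentioned above would need.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Illustrative only: maps each word to the ids of the documents that contain it. */
class InvertedIndexSketch {

    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();

    /** Tokenizes the text and records docId under every word found. */
    void add(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, key -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Returns the ids of all documents containing the word, or an empty set if there are none. */
    SortedSet<Integer> lookup(String word) {
        return postings.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Chain of Responsibility");
        index.add(2, "Chain of Command");
        System.out.println(index.lookup("chain"));    // prints [1, 2]
        System.out.println(index.lookup("command"));  // prints [2]
    }
}
```

Storing positions alongside each document id would cost more disk space but would make features such as highlighting possible.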
CASCON in Toronto
September 21, 2006. Posted by shahan in conference.
There are a number of very interesting presentations at this free conference. I suggest looking through the program and visiting. It runs Oct 16 – Oct 19, 2006 in Toronto, at the Hilton Suites in Markham.
The ones of particular interest to me are:
Tuesday Oct 17
Hands-On: Writing for the Web
Hands-on: Introduction to Social Computing Technologies
Tools for Managing Crosscutting Concerns
The 3rd Workshop on Engineering of Autonomic Software Systems
Wednesday Oct 18
Full Paper: Integrating Dynamic Views Using Model Driven Development
Social Computing: Best Practices
Sense and Response Applications
Thursday Oct 19
Keynote: Technology Leadership: Changing the World for Women and for Technology
Women in Technology: New Strategies to Attract a New Generation of Girls
Virtualization and the Management of Information Services
Social Computing: How the Social Web (R)evolution is Changing the Business Landscape
Welcome
September 1, 2006. Posted by shahan in random.
Hello Everyone and Welcome,
A very brief intro to my first blog: it will contain content related to web information and its management. This may range from specifics in programming to generalizations about current trends.
Thank you for visiting. Comments are always welcome.