Text Data Mining on Mosha's blog
With the blog owner's permission, I have collected blog posts dating back to late 2004, excluding posts on subjects outside core MDX discussions. This was copy-and-paste work: I marked the main text of each blog entry of interest and pasted it, as plain text without the HTML source tags, into a single large text file.
Text data mining in SQL Server 2005 or 2008 starts with SSIS and the Term Extraction and Term Lookup transforms in the data flow. You start by creating a dictionary of terms that you think are relevant. Then create a Data Flow task in the control flow, and in the data flow build something like the picture below.
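For readers who want to see the idea outside of SSIS, here is a rough Python sketch of what the two steps do conceptually: first pull candidate terms out of the text with a frequency count, then count how often each term from your own dictionary appears in each post. This is not how the SSIS transforms are implemented. The function names, the stop word handling, and the single-word tokenizer are all my own assumptions for illustration, and the real Term Extraction transform also works with noun phrases, which this sketch ignores.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (illustrative, single words only)."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def extract_terms(text, stop_words, min_count=2):
    """Hypothetical stand-in for a term extraction step.

    Returns a dict of term -> frequency for every token that is not a
    stop word and occurs at least min_count times.
    """
    counts = Counter(
        token for token in tokenize(text)
        if token not in stop_words and len(token) > 1
    )
    return {term: n for term, n in counts.items() if n >= min_count}


def lookup_terms(posts, dictionary):
    """Hypothetical stand-in for a term lookup step.

    posts is a list of (post_id, text) pairs and dictionary is the list of
    terms you consider relevant. Returns (post_id, term, frequency) rows
    for every dictionary term found in a post.
    """
    wanted = {term.lower() for term in dictionary}
    rows = []
    for post_id, text in posts:
        counts = Counter(tokenize(text))
        for term in sorted(wanted):
            if counts[term]:
                rows.append((post_id, term, counts[term]))
    return rows
```

You could run `extract_terms` over the whole text file to get candidate terms for the dictionary, prune that list by hand, and then pass the pruned list to `lookup_terms` to get one row per post and term.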
Tags: data mining