Confused about the tools you need for data mining? Ask
yourself these nine questions before you spend any
money.
1. Can existing data and infrastructure support the
proposed data-mining tool?
Some products will require a separate server for analysis, which
could increase the expenditure by adding server hardware and
software licensing costs. Other solutions operate only at the
desktop level, but may not scale well with large data sets.
2. Are there adequate data preparation
tools?
What tools are included in the data-mining solution to help
construct the data-mining database? Pre-mining data preparation
entails considerable effort. Consider choosing a solution that
includes cleansing tools. Does that solution also offer
transformation, integration and load capabilities? These tools
simplify the creation of the data-mining database.
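To give a sense of what cleansing and transformation involve, here is a minimal sketch in Python using only the standard library. The sample data, the column names (customer_id, region, annual_spend) and the cleanse function are all hypothetical, invented for illustration; a commercial tool would handle far more cases than this.

```python
import csv
import io

# Hypothetical raw extract: stray whitespace, mixed case,
# a missing value, and a duplicated customer record.
RAW = """customer_id,region,annual_spend
101, North ,1200.50
102,south,
101, North ,1200.50
103,EAST,980
"""

def cleanse(rows):
    """Trim whitespace, normalize case, and drop incomplete or duplicate rows."""
    seen = set()
    clean = []
    for row in rows:
        record = {key: value.strip() for key, value in row.items()}
        if not record["annual_spend"]:
            continue  # skip rows with no spend figure
        record["region"] = record["region"].lower()
        record["annual_spend"] = float(record["annual_spend"])
        if record["customer_id"] in seen:
            continue  # keep only the first record per customer
        seen.add(record["customer_id"])
        clean.append(record)
    return clean

rows = list(csv.DictReader(io.StringIO(RAW)))
clean_rows = cleanse(rows)
for record in clean_rows:
    print(record)
```

Even this toy example needs three separate rules for four rows of data, which is why bundled preparation tools can save considerable effort.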
3. How is data accessed?
Some tools force data to be extracted from its source and put
into the solution's proprietary format. Others support direct
access to the data sources.
4. Which models are supported?
Some data-mining tools and solutions support only one or two
modelling types. Will this be enough to support analysis of your
business problems?
5. Can the solution be integrated with third-party
tools?
What type of support is provided to integrate other tools, such
as an OLAP solution, into the data-mining environment? Or is
integration support limited to the supplier's tools? How will this
mesh with the tools you use now?
6. What's the mining output?
Data-mining results have to be decipherable to be worth the time
and effort needed to obtain them. Consider tools that produce charts and
graphs directing you to take specific actions (e.g. new business
rules). What types of reporting does the data-mining solution
provide? Can results be exported to create reports using a
third-party product?
7. How burdensome is model maintenance?
What APIs (application programming interfaces) does the
data-mining solution support? Supported interfaces let you change
the model to get better results. A tool that lacks model
flexibility may require custom coding.
8. How well does the solution scale?
Is the solution designed for single- or multi-processor
environments? Is there a maximum number of data elements, or are
data-mining activities constrained only by available memory,
processor, and disc?
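If a tool is limited mainly by memory, a rough back-of-the-envelope estimate can show whether your data is likely to fit. The Python sketch below is only an approximation under stated assumptions: the 8 bytes per value, the 50 percent headroom reserved for the tool's own working space, and the function names are all hypothetical and not drawn from any particular product.

```python
def estimated_bytes(rows, columns, bytes_per_value=8):
    """Rough size of a table held entirely in memory (assumed 8 bytes per value)."""
    return rows * columns * bytes_per_value

def fits_in_memory(rows, columns, available_bytes, headroom=0.5):
    """True if the table fits in the assumed usable share of available memory."""
    return estimated_bytes(rows, columns) <= available_bytes * headroom

GIB = 1024 ** 3

# Hypothetical data set: 10 million rows by 50 columns.
size = estimated_bytes(10_000_000, 50)
print(size)                                         # 4000000000
print(fits_in_memory(10_000_000, 50, 16 * GIB))     # True
print(fits_in_memory(10_000_000, 50, 4 * GIB))      # False
```

Real tools vary widely in how much working memory they need, so treat a figure like this as a starting point for questions to the supplier rather than a guarantee.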
9. Will users be satisfied?
Does the solution offer role-based access? Can an executive log
in and work with the data-mining interfaces as easily as a business
analyst? What will be required of the administrator? How much
training will be needed for administrators, analysts, and other
end-users?
Maggie Biggs writes for InfoWorld