Challenges to Web Mining
Web: A huge, widely-distributed, highly heterogeneous, semi-structured, interconnected, evolving, hypertext/hypermedia information repository.
- the “abundance” problem: most Web content is irrelevant to any given user's need
- limited coverage of the Web: search engines index only part of it (hidden Web sources are missed)
- limited query interface: keyword-oriented search
- limited customization to individual users
- difficult to enforce standards
DBMSs, DBers, and data miners will play an increasingly important role in the next generation of the Internet.