Stop-words are common words that carry little meaning in a retrieval
system. They are a part of natural language that any text miner will
encounter. Stop-words should be removed from a text because they add bulk
without adding information for analysts: they are not necessary for the
analysis, so eliminating them gives some data reduction. A query built
from stop-words would have a weak ability to categorize the text, because
these words return every element of the data set as a result
(Adsiz, 2006). In enterprise
search, all stop words, such as the common words "a" and "the",
are removed from multi-word queries to improve search performance. There
is no single master list of stop words that all tools use. Any group of
words can be chosen as the stop words for a given purpose, depending on their
importance and data reduction needs. For
some search engines, these are some of the most common, short function
words, such as "the", "is", "at", "which" and "on". In this case, stop words
can cause problems when searching for phrases that include them, particularly
in names such as 'The Who', 'The The', or 'Take That'. Other search
engines remove some of the most common words, including lexical words such as
"want", from a query in order to improve performance (Stackoverflow, 2008).
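
The filtering described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular search engine does it: the STOP_WORDS set is a hypothetical list built from the examples in the text, and tokenize and remove_stop_words are made-up helper names.

```python
import re

# Hypothetical stop-word list taken from the examples in the text;
# real tools each use their own list.
STOP_WORDS = {"a", "the", "is", "at", "which", "on"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the tokens of `text` that are not in `stop_words`."""
    return [token for token in tokenize(text) if token not in stop_words]


# An ordinary sentence shrinks to its content words.
print(remove_stop_words("The cat is on the mat"))

# Band names show the phrase problem: part or all of the query disappears.
print(remove_stop_words("The Who"))
print(remove_stop_words("The The"))
```

With this list, "The Who" is reduced to the single token "who" and "The The" to an empty query, which is why phrase searches for such names tend to fail once stop words are stripped.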
1- Adsiz, A. (2006). Text Mining (dissertation). Ahmet Yesevi University.
2- Stackoverflow. (2008). http://blog.stackoverflow.com/2008/12/podcast-32/