Sunday, April 20, 2014
Research on how search engines go about implementing blog search

Big-time scientists discuss what's wrong with current blog search and how it should be done properly.
"We introduce resource selection techniques for blog site search and evaluate their performance. Further, we propose a “diversity factor” that measures the topic diversity of each blog site."
"In this paper, we propose probabilistic models for blog search and mining using two machine learning techniques, Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (PLSA). We implement the models in our database of business blogs, with the aim of achieving higher precision and recall."
"In this work we adapt a state-of-the-art federated search model to the feed retrieval task, showing a significant improvement over algorithms based on the best performing submissions in the TREC 2007 Blog Distillation task."
"In this work, we propose a novel post-indexing spam-blog (or splog) detection method, which capitalizes on the results returned by blog search engines. More specifically, we analyze the search results of a sequence of temporally-ordered queries returned by a blog search engine, and build and maintain blog profiles for those blogs whose posts frequently appear in the top-ranked search results."
