waytopython

09 Jan, 2014

Whoosh: Pure Python search engine library, Part 1

Posted by waytopython @ 07:32

Whoosh is a library of classes and functions for indexing text and then searching the index. It is written in pure Python and helps you build custom search engines.

For example, if you were creating daily task software, you could use Whoosh to add a search function that lets users search their daily entries.
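To make "indexing text and then searching the index" more concrete, here is a toy inverted index in plain Python. It is only an illustration of the idea and not how Whoosh is implemented; the function names (tokenize, build_index, search) and the sample entries are made up for this sketch.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, word):
    """Return the ids of all documents containing the word, sorted."""
    return sorted(index.get(word.lower(), set()))


# Two hypothetical daily task entries, keyed by a path-like id.
documents = {
    "/xxx": "current month is january and today's task is buy a gift for brother on his birthday!",
    "/yyy": "Today's task is to work on python!",
}

index = build_index(documents)
print(search(index, "birthday"))
print(search(index, "task"))
```

The index is built once, and each search is then a dictionary lookup instead of a scan over every entry. A real library such as Whoosh adds schemas, on-disk storage, query parsing, and ranking on top of this basic idea.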

A Very Basic Whoosh Example

1. Install Whoosh

pip install Whoosh

2. Start a Python shell in the terminal and enter the following code

import os

from whoosh.fields import ID, TEXT, Schema
from whoosh.index import create_in
from whoosh.qparser import QueryParser

# title and path are stored so they can be shown in results; content is only indexed
schema = Schema(title=TEXT(stored=True), path=ID(stored=True), content=TEXT)

# Create the index directory on the first run
if not os.path.exists("indexdirectory"):
    os.mkdir("indexdirectory")

ix = create_in("indexdirectory", schema)

# Add two documents and commit them to the index
writer = ix.writer()
writer.add_document(title=u"First day task", path=u"/xxx", content=u"current month is january and today's task is buy a gift for brother on his birthday!")
writer.add_document(title=u"Other task", path=u"/yyy", content=u"Today's task is to work on python!")
writer.commit()

# Search the content field for the word "birthday" and show the best hit
with ix.searcher() as searcher:
    query = QueryParser("content", ix.schema).parse("birthday")
    results = searcher.search(query)
    print(results[0])

Output (under Python 2; Python 3 prints the strings without the u prefix):
<Hit {'path': u'/xxx', 'title': u'First day task'}>

In the next part, we will use Whoosh within the Django framework: the text to search for will come from a form, and the content to be searched will come from the database.

 

 


Comments

  1. Very interesting! What is the best known application for whoosh? Also what are the limitations? Does it need to keep all searched entries in memory?

    Posted by Eric — 15 May 2014, 11:46

