Lucene3.6 之查詢篇

jopen 11年前發布 | 14K 次閱讀 Lucene 搜索引擎
1、BooleanQuery
lucene3.6中BooleanQuery 實現與或的復合搜索
BooleanClause用于表示布爾查詢子句關系的類，包括：BooleanClause.Occur.MUST，BooleanClause.Occur.MUST_NOT，BooleanClause.Occur.SHOULD。必須包含,不能包含,可以包含三種.有以下6種組合：

1．MUST和MUST：取得連個查詢子句的交集。
2．MUST和MUST_NOT：表示查詢結果中不能包含MUST_NOT所對應得查詢子句的檢索結果。
3．SHOULD與MUST_NOT：連用時，功能同MUST和MUST_NOT。
4．SHOULD與MUST連用時，結果為MUST子句的檢索結果,但是SHOULD可影響排序。
5．SHOULD與SHOULD：表示“或”關系，最終檢索結果為所有檢索子句的并集。
6．MUST_NOT和MUST_NOT：無意義，檢索無結果。
示例代碼
public static void query(String path,String keyword,int size){

    try {
        File file = new File(path);
        Directory mdDirectory = FSDirectory.open(file);
        Analyzer analyzer = new IKAnalyzer();

//          Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);

        IndexReader reader = IndexReader.open(mdDirectory);

        IndexSearcher searcher = new IndexSearcher(reader);

        String[] fieldName = { "title", "category" };   // (在多個Filed中搜索)
        QueryParser queryParser = new MultiFieldQueryParser(
                Version.LUCENE_36, fieldName, analyzer);
        Query q1 = queryParser.parse(keyword);

        QueryParser parser = new QueryParser(Version.LUCENE_36, "author", analyzer);
        Query q2 = parser.parse("周偉明");

        BooleanQuery boolQuery = new BooleanQuery();
        boolQuery.add(q1, BooleanClause.Occur.MUST);
        boolQuery.add(q2,BooleanClause.Occur.MUST);

        ScoreDoc[] docs = searcher.search(boolQuery, null, size).scoreDocs;

        for (int i = 0; docs != null && i < docs.length; i++) {

            Document doc = searcher.doc(docs[i].doc);

            int id = Integer.parseInt(doc.get("id"));
            String title = doc.get("title");
            String author = doc.get("author");
            String publishTime = doc.get("publishTime");
            String source = doc.get("source");
            String category = doc.get("category");
            float reputation = Float.parseFloat(doc.get("reputation"));

            Book book = new Book();

            book.setId(id);
            book.setTitle(title);
            book.setAuthor(author);
            book.setPublishTime(publishTime);
            book.setSource(source);
            book.setCategory(category);
            book.setReputation(reputation);

            System.out.println(book);
        }

        reader.close();
        searcher.close();

    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ParseException e) {
        e.printStackTrace();
    }

}</pre><br />

2、TermQuery

詞條查詢，通過對某個詞條的指定，實現檢索索引中存在該詞條的所有文檔。


@Test
    public void testTermQuery(){
        try {
            String path = "D://LuceneEx/day02";
            String keyword = "android";
            File file = new File(path);
            Directory mdDirectory = FSDirectory.open(file);

        IndexReader reader = IndexReader.open(mdDirectory);

        IndexSearcher searcher = new IndexSearcher(reader);

        TermQuery query = new TermQuery(new Term("title", keyword));

        TopDocs tops = searcher.search(query, null, 50);

        int count = tops.totalHits;

        System.out.println("totalHits=" + count);

        ScoreDoc[] docs = tops.scoreDocs;

        for (int i = 0; i < docs.length; i++) {

            Document doc = searcher.doc(docs[i].doc);

            float score = docs[i].score;

            int id = Integer.parseInt(doc.get("id"));
            String title = doc.get("title");
            String author = doc.get("author");
            String publishTime = doc.get("publishTime");
            String source = doc.get("source");
            String category = doc.get("category");
            float reputation = Float.parseFloat(doc.get("reputation"));

            System.out.println(id + "\t" + title + "\t" + author + "\t"
                    + publishTime + "\t" + source + "\t" + category + "\t"
                    + reputation+"\t"+score);
        }

        reader.close();
        searcher.close();

    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}</pre><br />







3、TermRangeQuery


范圍查詢，這種范圍可以是日期，時間，數字，大小等等。可以使用"context:[a to b]"（包含邊界）或者"content:{a to b}"（不包含邊界） 查詢表達式  

示例代碼


@Test
    public void testTermRangeQuery(){

    try {
        String path = "D://LuceneEx/day01";
        File file = new File(path);
        Directory mdDirectory = FSDirectory.open(file);

        IndexReader reader = IndexReader.open(mdDirectory);

        IndexSearcher searcher = new IndexSearcher(reader);

        String fieldName = "publishTime";   
        //查詢出版日期在 "2011-04" 到 "2011-07" 之間的書籍
        TermRangeQuery tq = new TermRangeQuery(fieldName, "2011-04", "2011-07", false, true);

        TopDocs tops = searcher.search(tq, null, 10);
        int count = tops.totalHits;

        System.out.println("totalHits="+count);

        ScoreDoc[] docs = tops.scoreDocs;

        for(int i=0;i<docs.length;i++){
            Document doc = searcher.doc(docs[i].doc);

            int id = Integer.parseInt(doc.get("id"));
            String title = doc.get("title");
            String author = doc.get("author");
            String publishTime = doc.get("publishTime");
            String source = doc.get("source");
            String category = doc.get("category");
            float reputation = Float.parseFloat(doc.get("reputation"));

            System.out.println(id+" "+title+" "+author+" "+publishTime+" "+source);
        }

        reader.close();
        searcher.close();

    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}</pre> <p></p>

 

4、PrefixQuery

搜索以指定字符串開頭的項的文檔。當查詢表達式中的短語以"*"結尾時，QueryParser的parse函數會為查詢表達式項創建一個PrefixQuery對象。

示例代碼


@Test
    public void testPrefixQuery(){

    try {
        String path = "D://LuceneEx/day01";
        File file = new File(path);
        Directory mdDirectory = FSDirectory.open(file);

        IndexReader reader = IndexReader.open(mdDirectory);

        IndexSearcher searcher = new IndexSearcher(reader);

        String fieldName = "source";    
        Term prefix = new Term(fieldName, "清華大學");
        PrefixQuery preq = new PrefixQuery(prefix );

        TopDocs tops = searcher.search(preq, null, 10);
        int count = tops.totalHits;

        System.out.println("totalHits="+count);

        ScoreDoc[] docs = tops.scoreDocs;

        for(int i=0;i<docs.length;i++){
            Document doc = searcher.doc(docs[i].doc);

            int id = Integer.parseInt(doc.get("id"));
            String title = doc.get("title");
            String author = doc.get("author");
            String publishTime = doc.get("publishTime");
            String source = doc.get("source");
            String category = doc.get("category");
            float reputation = Float.parseFloat(doc.get("reputation"));

            System.out.println(id+" "+title+" "+author+" "+publishTime+" "+source);
        }

        reader.close();
        searcher.close();

    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}</pre><br />




5、PhraseQuery

短語查詢，默認為完全匹配，但可以指定坡度（Slop，默認為0）改變范圍。比如Slop=1，檢索短語為“電臺”，那么在“電臺”中間有一個字的也可以被查找出來，比如“電視臺”。  查詢表達式可以為“電 臺 ~1”

示例代碼


@Test
    public void testPhraseQuery(){

    try {
        String path = "D://LuceneEx/day01";
        File file = new File(path);
        Directory mdDirectory = FSDirectory.open(file);

        IndexReader reader = IndexReader.open(mdDirectory);

        IndexSearcher searcher = new IndexSearcher(reader);

        String fieldName = "title";     
        PhraseQuery query = new PhraseQuery();
        query.add(new Term(fieldName,"Lucene"));
        query.add(new Term(fieldName,"入門"));

//          query.setSlop(1);

        TopDocs tops = searcher.search(query, null, 50);
        int count = tops.totalHits;

        System.out.println("totalHits="+count);

        ScoreDoc[] docs = tops.scoreDocs;

        for(int i=0;i<docs.length;i++){
            Document doc = searcher.doc(docs[i].doc);

            int id = Integer.parseInt(doc.get("id"));
            String title = doc.get("title");
            String author = doc.get("author");
            String publishTime = doc.get("publishTime");
            String source = doc.get("source");
            String category = doc.get("category");
            float reputation = Float.parseFloat(doc.get("reputation"));

            System.out.println(id+" "+title+" "+author+" "+publishTime+" "+source);
        }

        reader.close();
        searcher.close();

    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}</pre><br />




6、FuzzyQuery

模糊查詢使用的匹配算法是levensh-itein算法。此算法在比較兩個字符串時，將動作分為3種：加一個字母（Insert），刪一個字母（Delete），改變一個字母（Substitute）。  編輯距離能夠影響結果的得分,編輯距離越小得分越高.查詢表達式為"fuzzy~",使用~來表示模糊查詢。

示例代碼


@Test
    public void testFuzzyQuery(){

    try {
        String path = "D://LuceneEx/day01";
        File file = new File(path);
        Directory mdDirectory = FSDirectory.open(file);

        IndexReader reader = IndexReader.open(mdDirectory);

        IndexSearcher searcher = new IndexSearcher(reader);

        String fieldName = "category";
        Term term = new Term(fieldName, "云計算");
        FuzzyQuery query = new FuzzyQuery(term, 0.1f);

//          FuzzyQuery query = new FuzzyQuery(term, 0.1f,1);

        TopDocs tops = searcher.search(query, null, 50);
        int count = tops.totalHits;

        System.out.println("totalHits="+count);

        ScoreDoc[] docs = tops.scoreDocs;

        for(int i=0;i<docs.length;i++){
            Document doc = searcher.doc(docs[i].doc);

            int id = Integer.parseInt(doc.get("id"));
            String title = doc.get("title");
            String author = doc.get("author");
            String publishTime = doc.get("publishTime");
            String source = doc.get("source");
            String category = doc.get("category");
            float reputation = Float.parseFloat(doc.get("reputation"));

            System.out.println(id+" "+title+" "+author+" "+publishTime+" "+source+" "+category);
        }

        reader.close();
        searcher.close();

    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}</pre><br />




7、WildcardQuery

通配符查詢，“*”號表示0到多個字符，“？”表示單個字符。 最好不要用通配符為首，否則會遍歷所有索引項


@Test
    public void testWildcardQuery(){

    try {
        String path = "D://LuceneEx/day01";
        File file = new File(path);
        Directory mdDirectory = FSDirectory.open(file);

        IndexReader reader = IndexReader.open(mdDirectory);

        IndexSearcher searcher = new IndexSearcher(reader);

        String fieldName = "title";

        Term term = new Term(fieldName, "lucene*");

        WildcardQuery query = new WildcardQuery(term);

        TopDocs tops = searcher.search(query, null, 100);
        int count = tops.totalHits;

        System.out.println("totalHits="+count);

        ScoreDoc[] docs = tops.scoreDocs;

        for(int i=0;i<docs.length;i++){
            Document doc = searcher.doc(docs[i].doc);

            int id = Integer.parseInt(doc.get("id"));
            String title = doc.get("title");
            String author = doc.get("author");
            String publishTime = doc.get("publishTime");
            String source = doc.get("source");
            String category = doc.get("category");
            float reputation = Float.parseFloat(doc.get("reputation"));

            System.out.println(id+" "+title+" "+author+" "+publishTime+" "+source+" "+category);
        }

        reader.close();
        searcher.close();

    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}</pre><br />




8、SpanQuery

SpanQuery：跨度查詢。此類為抽象類。  

SpanTermQuery：檢索效果完全同TermQuery，但內部會記錄一些位置信息，供SpanQuery的其它API使用，是其它屬于SpanQuery的Query的基礎。  

SpanFirstQuery：查找方式為從Field的內容起始位置開始，在一個固定的寬度內查找所指定的詞條。  

SpanNearQuery：功能類似PharaseQuery，SpanNearQuery查找所匹配的不一定是短語，還有可能是另一個SpanQuery的查詢結果作為整體考慮，進行嵌套查詢。  

SpanOrQuery：把所有SpanQuery查詢結果綜合起來，作為檢索結果。  

SpanNotQuery：從第一個SpanQuery查詢結果中，去掉第二個SpanQuery查詢結果，作為檢索結果。  

 

示例代碼


@Test
    public void testSpanQuery(){

    try {
        String path = "D://LuceneEx/day01";
        File file = new File(path);
        Directory mdDirectory = FSDirectory.open(file);

        IndexReader reader = IndexReader.open(mdDirectory);

        IndexSearcher searcher = new IndexSearcher(reader);

        String fieldName = "title";

        Term t1=new Term(fieldName,"權威");
        Term t2=new Term(fieldName,"lucene");
        Term t3=new Term(fieldName,"搜索");
        Term t4=new Term(fieldName,"出版社");

        SpanTermQuery q1=new SpanTermQuery(t1);
        SpanTermQuery q2=new SpanTermQuery(t2);
        SpanTermQuery q3=new SpanTermQuery(t3);
        SpanTermQuery q4=new SpanTermQuery(t4);

        SpanNearQuery query1=new SpanNearQuery(new SpanQuery[]{q1,q2},1,false);
        SpanNearQuery query2=new SpanNearQuery(new SpanQuery[]{q3,q4},3,false);
        SpanNotQuery query = new SpanNotQuery(query1, query2);


//            Term t =new Term("content","mary");
//            SpanTermQuery people = new SpanTermQuery(t);
//            SpanFirstQuery query = new SpanFirstQuery(people,3);//3是寬度

        TopDocs tops = searcher.search(query, null, 100);
        int count = tops.totalHits;

        System.out.println("totalHits="+count);

        ScoreDoc[] docs = tops.scoreDocs;

        for(int i=0;i<docs.length;i++){
            Document doc = searcher.doc(docs[i].doc);

            int id = Integer.parseInt(doc.get("id"));
            String title = doc.get("title");
            String author = doc.get("author");
            String publishTime = doc.get("publishTime");
            String source = doc.get("source");
</pre> 來自：http://blog.csdn.net/fx_sky/article/details/8543146
本文由用戶 jopen 自行上傳分享，僅供網友學習交流。所有權歸原作者，若您的權利被侵害，請聯系管理員。
轉載本站原創文章，請注明出處，并保留原始鏈接、圖片水印。
本站是一個以用戶分享為主的開源技術平臺，歡迎各類分享！
本文地址：http://www.baiduhome.net/lib/view/open1392173494004.html
Lucene 搜索引擎
Lucene3.6 之查詢篇

相關經驗

相關資訊

相關文檔

目錄

Lucene3.6 之 查詢篇

相關經驗

相關資訊

相關文檔

目錄

Lucene3.6 之查詢篇