Java Search Engine Apache Lucene 3.4.0 Released
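Lucene is an open-source library for full-text indexing and search, supported and provided by the Apache Software Foundation. It offers a simple yet powerful API for full-text indexing and search, and it is a mature, free, open-source tool in the Java ecosystem. It is, and has been for several years, the most popular free Java information retrieval library. Information retrieval libraries are often described as search engines, but they should not be confused with web search engines.
Lucene was originally written by Doug Cutting, a veteran full-text indexing and retrieval expert. He was a principal developer of the V-Twin search engine, later served as a senior system architect at Excite, and currently works on Internet infrastructure research. His goal in contributing Lucene was to bring full-text search to all kinds of small and medium-sized applications.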
Apache Lucene 3.4.0 has been released. This version includes many bug fixes, optimizations, and improvements. Download:
http://www.apache.org/dyn/closer.cgi/lucene/java
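If you are currently using Apache Lucene 3.1, 3.2, or 3.3, we strongly recommend upgrading to this release immediately.
Highlights of Lucene 3.4.0:
* Fixed a major bug (LUCENE-3418) that could corrupt the index when the operating system or the machine crashed.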
* Added a new faceting module (contrib/facet) for computing facet counts (both hierarchical and non-hierarchical) at search time (LUCENE-3079).
* Added a new join module (contrib/join), enabling indexing and searching of nested (parent/child) documents using BlockJoinQuery/Collector (LUCENE-3171).
* It is now possible to index documents with term frequencies included but without positions (LUCENE-2048); previously omitTermFreqAndPositions always omitted both.
* The modular QueryParser (contrib/queryparser) can now create NumericRangeQuery.
* Added SynonymFilter, in contrib/analyzers, to apply multi-word synonyms during indexing or querying, including parsers to read the WordNet and Solr synonym formats (LUCENE-3233).
* You can now control how documents that don't have a value on the sort field should sort (LUCENE-3390), using SortField.setMissingValue.
* Fixed a case where term vectors could be silently deleted from the index after addIndexes (LUCENE-3402).
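To illustrate the missing-value sorting highlight (LUCENE-3390), here is a minimal plain-Java sketch of the idea. It does not use Lucene's API: the `Doc` class, the `price` field, and the `sortedIds` helper are hypothetical names invented for this illustration. It only shows how substituting a chosen value for absent sort fields decides whether such documents sort first or last.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Conceptual sketch only; none of these types come from Lucene.
final class MissingValueSortSketch {

    /** Hypothetical stand-in for an indexed document with an optional numeric sort field. */
    static final class Doc {
        final String id;
        final Integer price; // null means the document has no value for the sort field

        Doc(String id, Integer price) {
            this.id = id;
            this.price = price;
        }
    }

    /**
     * Returns document ids sorted ascending by price, using missingValue
     * in place of the price for documents that have none.
     */
    static List<String> sortedIds(List<Doc> docs, final int missingValue) {
        List<Doc> copy = new ArrayList<Doc>(docs); // leave the caller's list untouched
        copy.sort(Comparator.comparingInt(
                (Doc d) -> d.price == null ? missingValue : d.price));
        List<String> ids = new ArrayList<String>();
        for (Doc d : copy) {
            ids.add(d.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("a", 30),
                new Doc("b", null), // no price
                new Doc("c", 10));

        // A very large substitute pushes documents without a price to the end.
        System.out.println(sortedIds(docs, Integer.MAX_VALUE)); // [c, a, b]

        // A very small substitute pulls them to the front instead.
        System.out.println(sortedIds(docs, Integer.MIN_VALUE)); // [b, c, a]
    }
}
```

In Lucene itself the equivalent choice is made through SortField.setMissingValue, as the list above notes; check the 3.4.0 Javadoc for the exact signature before relying on it.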
Project home: http://lucene.apache.org/
Documentation: http://www.baiduhome.net/doc/list/125