indexing - Multilingual search using Lucene


I am building a multilingual search and will use Lucene as the tool for it.

I already have the translated content; each document will be available in 3 or 4 languages.

For indexing and searching, I can see 4 possible strategies for each document / piece of content:

  1. Each language is indexed in its own index / directory.
  2. Each language is indexed in a separate document, but in the same index.
  3. Each language is indexed in a different field, but in the same document.
  4. All languages are indexed in the same field of the same document.

I have not tested all of these cases yet. Can anyone tell me which is the better approach for multilingual search?

Thank you!

In short, it depends on your needs, but I would go with option 3 or 1.

1) is probably the best way to go if there are no overlapping / shared fields between the languages.

3) is the better choice if many fields are shared across languages, because it saves disk space and lets a larger portion of the index fit in the file system cache.
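To make option 3 concrete, here is a minimal sketch in plain Java (no Lucene dependency) of how one piece of content could be flattened into a single set of fields. The `body_<lang>` naming scheme, the class name and the sample field names (`id`, `author`) are assumptions made for illustration, not something from the question. In Lucene, each entry would become a field on one document, and each language field could get its own analyzer, for example through `PerFieldAnalyzerWrapper`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MultilingualDoc {
    // Hypothetical naming scheme: "<base>_<lang>", e.g. "body_fr".
    static String fieldFor(String base, String lang) {
        return base + "_" + lang;
    }

    // Option 3: shared fields are stored once per document,
    // translated text gets one field per language.
    static Map<String, String> toFields(Map<String, String> shared,
                                        Map<String, String> bodyByLang) {
        Map<String, String> fields = new LinkedHashMap<>(shared);
        for (Map.Entry<String, String> e : bodyByLang.entrySet()) {
            fields.put(fieldFor("body", e.getKey()), e.getValue());
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> shared = new LinkedHashMap<>();
        shared.put("id", "42");          // illustrative shared fields
        shared.put("author", "jdoe");

        Map<String, String> body = new LinkedHashMap<>();
        body.put("en", "hello world");
        body.put("fr", "bonjour le monde");

        // One flat field map per document, ready to be indexed.
        System.out.println(toFields(shared, body));
    }
}
```

With option 2, the shared fields (`id`, `author`) would have to be repeated in every per-language document, which is where the disk-space argument comes from.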

I would not recommend 2): it makes your search queries more complex and forces Lucene to consider more documents.

4) will make your search queries very complex, unless you want to let users search without first selecting a language.
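As a rough illustration of why selecting a language first keeps queries simple with option 3, the sketch below builds query strings in Lucene's classic query parser syntax (`field:(terms)`). The `body_<lang>` field names and the helper names are hypothetical, and the user input is not escaped here, which real code would need to handle.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class LanguageQuery {
    // Restrict the query to the field of the language the user selected.
    static String forLanguage(String lang, String userTerms) {
        return "body_" + lang + ":(" + userTerms + ")";
    }

    // No language selected: the same terms must be OR-ed across every
    // language field, so the query grows with the number of languages.
    static String forAnyLanguage(List<String> langs, String userTerms) {
        StringJoiner or = new StringJoiner(" OR ");
        for (String lang : langs) {
            or.add(forLanguage(lang, userTerms));
        }
        return or.toString();
    }

    public static void main(String[] args) {
        System.out.println(forLanguage("fr", "bonjour monde"));
        System.out.println(forAnyLanguage(Arrays.asList("en", "fr"), "monde"));
    }
}
```

With option 1 the same choice would instead decide which index to open, and the field name could stay the same in every index.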
