Procedure for adding a Scheduler to Solr

This procedure creates an import task with the DataImportHandler and uses a scheduler inside Solr itself to run that indexing periodically.

The documentation on adding a scheduler to Solr is available at the following address:
http://wiki.apache.org/solr/DataImportHandler#Scheduling

$solar.home = the main Solr folder; when running on Jetty, it is the folder that contains Jetty's start.jar file. In my example I unpacked Solr into C:\Solr\solr-4.4.0, so the home folder is C:\Solr\solr-4.4.0\example.

The procedure consists of 3 steps:

• Create a simple Java project with the classes SolrDataImportProperties.java, ApplicationListener.java and HTTPPostScheduler.java, build a jar from them and add it to Solr's lib folder (a ready-made project is available at https://code.google.com/p/solr-data-import-scheduler/downloads/list).
In our case we built the jar dihs.jar and placed it in $solar.home\solr-webapp\webapp\WEB-INF\lib

• Add the listener to the web.xml file located at
$solar.home\solr-webapp\webapp\WEB-INF\web.xml

<listener>
        <listener-class>org.apache.solr.handler.dataimport.scheduler.ApplicationListener</listener-class>
</listener>

• In the $solar.home/solr/conf folder, create the file dataimport.properties with the contents below; if the folder does not exist, create it.

#Tue Jul 21 12:10:50 CEST 2010
metadataObject.last_index_time=2010-09-20 11\:12\:47
last_index_time=2010-09-20 11\:12\:47


#################################################
#                                               #
#       dataimport scheduler properties         #
#                                               #
#################################################

#  to sync or not to sync
#  1 - active; anything else - inactive
syncEnabled=1

#  which cores to schedule
#  in a multi-core environment you can decide which cores you want synchronized
#  leave empty or comment it out if using single-core deployment
syncCores=collection1

#  solr server name or IP address
#  [defaults to localhost if empty]
server=localhost

#  solr server port
#  [defaults to 80 if empty]
port=8983

#  application name/context
#  [defaults to current ServletContextListener's context (app) name]
webapp=solr

#  URL params [mandatory]
#  remainder of URL
params=/select?qt=/dataimport&command=delta-import&clean=false&commit=true

#  schedule interval
#  number of minutes between two runs
#  [defaults to 30 if empty]
interval=2

With the configuration above, the scheduler runs every two minutes. Once this is in place, start Solr and check the log file $solar.home\logs\solr.log
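Judging by the "Full URL" entry that the scheduler writes to the log, the properties appear to be combined into an address of the form http://server:port/webapp/core/params. The sketch below reproduces that combination so the resulting URL can be checked before Solr is restarted. It is not the scheduler's own code: the class DihUrlSketch and its methods are hypothetical names, and falling back to "solr" for an empty webapp is an assumption (the real default is the servlet context name).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of the scheduler jar.
final class DihUrlSketch {

    // Returns the trimmed value of a key, or the fallback when it is missing or blank.
    static String valueOrDefault(Properties props, String key, String fallback) {
        String value = props.getProperty(key);
        return (value == null || value.trim().isEmpty()) ? fallback : value.trim();
    }

    // Builds one URL per core listed in syncCores.
    // A blank syncCores yields a single URL without a core segment.
    static List<String> buildUrls(Properties props) {
        String server = valueOrDefault(props, "server", "localhost"); // documented default
        String port = valueOrDefault(props, "port", "80");            // documented default
        String webapp = valueOrDefault(props, "webapp", "solr");      // assumption, see lead-in
        String params = valueOrDefault(props, "params", "");
        String cores = valueOrDefault(props, "syncCores", "");

        String base = "http://" + server + ":" + port + "/" + webapp;
        List<String> urls = new ArrayList<>();
        if (cores.isEmpty()) {
            urls.add(base + params);
        } else {
            for (String core : cores.split(",")) {
                urls.add(base + "/" + core.trim() + params);
            }
        }
        return urls;
    }

    // Parses text in the java.util.Properties file format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String config = String.join("\n",
                "syncCores=collection1",
                "server=localhost",
                "port=8983",
                "webapp=solr",
                "params=/select?qt=/dataimport&command=delta-import&clean=false&commit=true");
        for (String url : buildUrls(parse(config))) {
            System.out.println(url);
        }
    }
}
```

With the values from this post, the sketch prints the same address that the log reports as the Full URL, which makes it a quick way to spot a wrong port or core name.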

A sample log follows:


INFO  - 2013-09-03 13:53:37.279; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Process started at .............. 03.09.2013 13:53:37 279
INFO  - 2013-09-03 13:53:37.280; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Full URL				http://localhost:8983/solr/collection1/select?qt=/dataimport&command=delta-import&clean=false&commit=true
INFO  - 2013-09-03 13:53:37.283; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Request method			POST
INFO  - 2013-09-03 13:53:37.284; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Succesfully connected to server	localhost
INFO  - 2013-09-03 13:53:37.285; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Using port			8983
INFO  - 2013-09-03 13:53:37.287; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Application name			solr
INFO  - 2013-09-03 13:53:37.288; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> URL params			/select?qt=/dataimport&command=delta-import&clean=false&commit=true
INFO  - 2013-09-03 13:53:37.297; org.apache.solr.handler.dataimport.DataImporter; Data Configuration loaded successfully
INFO  - 2013-09-03 13:53:37.297; org.apache.solr.handler.dataimport.DataImporter; Starting Delta Import
INFO  - 2013-09-03 13:53:37.305; org.apache.solr.handler.dataimport.SimplePropertiesWriter; Read dataimport.properties
INFO  - 2013-09-03 13:53:37.307; org.apache.solr.handler.dataimport.DocBuilder; Starting delta collection.
INFO  - 2013-09-03 13:53:37.313; org.apache.solr.handler.dataimport.DocBuilder; Running ModifiedRowKey() for Entity: course
INFO  - 2013-09-03 13:53:37.314; org.apache.solr.handler.dataimport.DocBuilder; Completed ModifiedRowKey for Entity: course rows obtained : 0
INFO  - 2013-09-03 13:53:37.315; org.apache.solr.handler.dataimport.DocBuilder; Completed DeletedRowKey for Entity: course rows obtained : 0
INFO  - 2013-09-03 13:53:37.316; org.apache.solr.handler.dataimport.DocBuilder; Completed parentDeltaQuery for Entity: course
INFO  - 2013-09-03 13:53:37.318; org.apache.solr.handler.dataimport.DocBuilder; Delta Import completed successfully
INFO  - 2013-09-03 13:53:37.319; org.apache.solr.handler.dataimport.DocBuilder; Time taken = 0:0:0.14
INFO  - 2013-09-03 13:53:37.320; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp=/solr path=/select params={clean=false&commit=true&command=delta-import&qt=/dataimport} {} 0 29
INFO  - 2013-09-03 13:53:37.322; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Response message			OK
INFO  - 2013-09-03 13:53:37.323; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Response code			200
INFO  - 2013-09-03 13:53:37.326; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Disconnected from server		localhost
INFO  - 2013-09-03 13:53:37.327; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Process ended at ................ 03.09.2013 13:53:37 327
INFO  - 2013-09-03 13:55:37.279; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Process started at .............. 03.09.2013 13:55:37 279
INFO  - 2013-09-03 13:55:37.280; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Full URL				http://localhost:8983/solr/collection1/select?qt=/dataimport&command=delta-import&clean=false&commit=true
INFO  - 2013-09-03 13:55:37.283; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Request method			POST
INFO  - 2013-09-03 13:55:37.284; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Succesfully connected to server	localhost
INFO  - 2013-09-03 13:55:37.286; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Using port			8983
INFO  - 2013-09-03 13:55:37.287; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> Application name			solr
INFO  - 2013-09-03 13:55:37.289; org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; [collection1] <index update process> URL params			/select?qt=/dataimport&command=delta-import&clean=false&commit=true
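Reading the log by eye works for a few runs, but a small filter makes it easier to confirm that every scheduled run returned 200. The sketch below pulls the core name and the HTTP status out of the "Response code" lines in the format shown above. The class SchedulerLogCheck is a hypothetical helper, and the regular expression assumes the log layout stays exactly as in this sample.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting solr.log; not part of Solr or the scheduler jar.
final class SchedulerLogCheck {

    // Matches scheduler lines such as:
    // "...HttpPostScheduler; [collection1] <index update process> Response code   200"
    private static final Pattern RESPONSE_CODE = Pattern.compile(
            "HttpPostScheduler; \\[([^\\]]+)\\] <index update process> Response code\\s+(\\d{3})");

    // Returns one "core=status" entry per "Response code" line, in log order.
    static List<String> responseCodes(List<String> logLines) {
        List<String> result = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = RESPONSE_CODE.matcher(line);
            if (m.find()) {
                result.add(m.group(1) + "=" + m.group(2));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // With an argument, read that file (for example the path to solr.log);
        // without one, fall back to a sample line copied from the log above.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList("INFO  - 2013-09-03 13:53:37.323; "
                        + "org.apache.solr.handler.dataimport.scheduler.HttpPostScheduler; "
                        + "[collection1] <index update process> Response code\t\t\t200");
        for (String entry : responseCodes(lines)) {
            System.out.println(entry);
        }
    }
}
```

Any entry whose status is not 200 points to a run worth investigating in the surrounding log lines.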
