Solr User Manual: Working with Collections


I. Collections API

Reference: https://cwiki.apache.org/confluence/display/solr/Collections+API

The API has many actions, so I will not list them all; only the more important ones are covered here.

1. Creating a collection
Official example: /admin/collections?action=CREATE&name=name&numShards=number&replicationFactor=number&maxShardsPerNode=number&createNodeSet=nodelist&collection.configName=configname

(1) My example:

    http://192.168.66.99:8080/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFactor=2&maxShardsPerNode=3

    name: the name of the collection

    numShards: the number of shards

    replicationFactor: the number of replicas per shard

    maxShardsPerNode: the maximum number of shards per node (default 1)
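
As a minimal sketch, the example URL above can also be assembled programmatically. The helper name build_create_url is hypothetical; the host and parameter values are the ones from the example, and no request is sent.

```python
from urllib.parse import urlencode


def build_create_url(base_url, name, num_shards, replication_factor,
                     max_shards_per_node=1):
    """Return a Collections API CREATE URL (hypothetical helper, nothing is sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": max_shards_per_node,
    }
    return base_url + "/admin/collections?" + urlencode(params)


url = build_create_url("http://192.168.66.99:8080/solr", "test", 2, 2, 3)
print(url)
```

With 2 shards and 2 replicas each, 4 cores are created in total, so maxShardsPerNode has to be large enough for the available nodes to host them.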

(2) To specify configuration files or the index directory, add parameters of the following form:

property.name=value (type: string, required: no): Set core property name to value. See core.properties file contents.
The available keys are:

Key            Description
name           The name of the SolrCore. You'll use this name to reference the SolrCore when running commands with the CoreAdminHandler.
config         The configuration file name for a given core. The default is solrconfig.xml.
schema         The schema file name for a given core. The default is schema.xml.
dataDir        The core's data directory as a path relative to the instanceDir; data by default.
configSet      If set, the name of the configset to use to configure the core (see Config Sets).
properties     The name of the properties file for this core. The value can be an absolute pathname or a path relative to the value of instanceDir.
transient      If true, the core can be unloaded if Solr reaches the transientCacheSize. The default if not specified is false. Cores are unloaded in order of least recently used first.
loadOnStartup  If true (the default if not specified), the core will be loaded when Solr starts.
coreNodeName   Added in Solr 4.2, this attribute allows naming a core. The name can then be used later if you need to replace a machine with a new one. By assigning the new machine the same coreNodeName as the old core, it will take over for the old SolrCore.
ulogDir        The absolute or relative directory for the update log for this core (SolrCloud).
shard          The shard to assign this core to (SolrCloud).
collection     The name of the collection this core is part of (SolrCloud).
roles          Future param for SolrCloud or a way for users to mark nodes for their own use.

 

(3) Run:

    http://192.168.66.99:8080/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFactor=2&maxShardsPerNode=3&property.schema=schema2.xml&property.dataDir=/usr/local/data/solr

The command above creates the collection test, uses schema2.xml as its schema file, and stores its data under /usr/local/data/solr.
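
The property.name=value convention can be sketched as a small helper that prefixes each core property with "property." before the query string is built. The function name with_core_properties is hypothetical; the parameter values are taken from the command above.

```python
from urllib.parse import urlencode


def with_core_properties(params, core_properties):
    """Return a copy of params with each core property added as property.<key>."""
    merged = dict(params)
    for key, value in core_properties.items():
        merged["property." + key] = value
    return merged


base_params = {
    "action": "CREATE",
    "name": "test",
    "numShards": 2,
    "replicationFactor": 2,
    "maxShardsPerNode": 3,
}
params = with_core_properties(
    base_params,
    {"schema": "schema2.xml", "dataDir": "/usr/local/data/solr"},
)
# safe="/" keeps the slashes in the path readable instead of encoding them as %2F.
query = urlencode(params, safe="/")
print(query)
```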

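Note: if you specify configuration files this way, they must first be uploaded to ZooKeeper. The following command uploads the configuration directory (which contains schema2.xml) to ZooKeeper: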

 java -classpath .:/usr/local/solr/solrhome-1/lib/* org.apache.solr.cloud.ZkCLI -cmd upconfig -zkhost 127.0.0.1:1181,127.0.0.1:2181,127.0.0.1:3181 -confdir /usr/local/solr/solrhome-1/update/  -confname solr-conf

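When I ran the CREATE command on my machine, the following error occurred:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'test_shard1_replica1': Unable to create core: test_shard1_replica1 Caused by: Lock obtain timed out: NativeFSLock@/usr/local/data/solr/index/write.lock

This happens because all three nodes run on my local machine and the index directory points to the same location for every core, so the data folders collide and the cores compete for the same write.lock. The fix is to give each shard its own folder.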
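
One way to avoid the collision is to derive a separate data directory for each core from its name. The sketch below only computes the paths. The helper name per_core_data_dirs is hypothetical, and the core names other than test_shard1_replica1 are assumed to follow the same naming pattern as the one in the error message.

```python
import posixpath


def per_core_data_dirs(base_dir, core_names):
    """Map each core name to its own data directory under base_dir."""
    if len(set(core_names)) != len(core_names):
        # Two cores with the same name would end up sharing one write.lock.
        raise ValueError("core names must be unique")
    return {name: posixpath.join(base_dir, name) for name in core_names}


# Assumed core names, following the pattern seen in the error message.
cores = [
    "test_shard1_replica1",
    "test_shard1_replica2",
    "test_shard2_replica1",
    "test_shard2_replica2",
]
data_dirs = per_core_data_dirs("/usr/local/data/solr", cores)
for core, path in data_dirs.items():
    print(core, "->", path)
```

Each path can then be passed as property.dataDir (or dataDir in the Cores API) when the corresponding core is created.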


2. Deleting a collection
Official example: /admin/collections?action=DELETE&name=collection
My example: http://192.168.66.99:8080/solr/admin/collections?action=DELETE&name=test


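3. Creating a shard
Official example: /admin/collections?action=CREATESHARD&shard=shardName&collection=name
/admin/collections?action=SPLITSHARD: split a shard into two new shards
My example: http://192.168.66.99:8080/solr/admin/collections?action=CREATESHARD&collection=test&shard=shard1&name=test_shard1_replica1&property.schema=schema2.xml&property.dataDir=/usr/local/data/solr/test_shard1_replica1
In my testing, if the collection was created as described in section 1, creating a shard this way fails. I have not yet investigated the cause. One likely explanation is that, according to the Solr reference guide, CREATESHARD applies only to collections that use the implicit router, whereas a collection created with numShards uses the default compositeId router.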

4. Other actions

/admin/collections?action=RELOAD: reload a collection
/admin/collections?action=SPLITSHARD: split a shard into two new shards
/admin/collections?action=CREATESHARD: create a new shard
/admin/collections?action=DELETESHARD: delete an inactive shard
/admin/collections?action=CREATEALIAS: create or modify an alias for a collection
/admin/collections?action=DELETEALIAS: delete an alias for a collection
/admin/collections?action=DELETEREPLICA: delete a replica of a shard

/admin/collections?action=ADDREPLICA: add a replica of a shard
/admin/collections?action=CLUSTERPROP: Add/edit/delete a cluster-wide property

/admin/collections?action=MIGRATE: Migrate documents to another collection 
/admin/collections?action=ADDROLE: Add a specific role to a node in the cluster
/admin/collections?action=REMOVEROLE: Remove an assigned role
/admin/collections?action=OVERSEERSTATUS: Get status and statistics of the overseer
/admin/collections?action=CLUSTERSTATUS: Get cluster status
/admin/collections?action=REQUESTSTATUS: Get the status of a previous asynchronous request

/admin/collections?action=LIST: List all collections 
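
All of these actions share the same URL shape, so a single helper can build any of them. This is a minimal sketch: the function name collections_url is hypothetical, the action set is simply the actions named in this section, and the helper only builds the URL without sending it.

```python
from urllib.parse import urlencode

# Actions named in this section (CREATE and DELETE are covered earlier).
KNOWN_ACTIONS = {
    "CREATE", "DELETE", "RELOAD", "SPLITSHARD", "CREATESHARD", "DELETESHARD",
    "CREATEALIAS", "DELETEALIAS", "DELETEREPLICA", "ADDREPLICA", "CLUSTERPROP",
    "MIGRATE", "ADDROLE", "REMOVEROLE", "OVERSEERSTATUS", "CLUSTERSTATUS",
    "REQUESTSTATUS", "LIST",
}


def collections_url(base_url, action, **params):
    """Build a Collections API URL, rejecting actions not listed above."""
    action = action.upper()
    if action not in KNOWN_ACTIONS:
        raise ValueError("unknown Collections API action: " + action)
    query = {"action": action}
    query.update(params)
    return base_url + "/admin/collections?" + urlencode(query)


print(collections_url("http://192.168.66.99:8080/solr", "DELETE", name="test"))
print(collections_url("http://192.168.66.99:8080/solr", "LIST"))
```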

 

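II. Cores API

As I understand it, the Cores API operates on individual shards: a core can be seen as managing one shard or one of its replicas. It can also be used to create a collection.

Reference: https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage

Access pattern: http://localhost:8983/solr/admin/cores?action=action, where action is one of the following: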

1. Viewing status
Official example: http://localhost:8983/solr/admin/cores?action=STATUS&core=core0

2. Creating a core
Official example: http://localhost:8983/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path/to/dir&config=config_file_name.xml&schema=schema_file_name.xml&dataDir=data
The optional parameters are largely the same as those for creating a collection.

Parameter            Description
name                 The name of the new core. Same as "name" on the <core> element.
instanceDir          The directory where files for this SolrCore should be stored. Same as instanceDir on the <core> element.
config               (Optional) Name of the config file (solrconfig.xml) relative to instanceDir.
schema               (Optional) Name of the schema file (schema.xml) relative to instanceDir.
dataDir              (Optional) Name of the data directory relative to instanceDir.
configSet            (Optional) Name of the configset to use for this core (see Config Sets).
collection           (Optional) The name of the collection to which this core belongs. The default is the name of the core. collection.<param>=<value> causes a property of <param>=<value> to be set if a new collection is being created. Use collection.configName=<configname> to point to the configuration for a new collection.
shard                (Optional) The shard id this core represents. Normally you want to be auto-assigned a shard id.
property.name=value  (Optional) Sets the core property name to value. See core.properties file contents.
async                (Optional) Request ID to track this action which will be processed asynchronously.

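My example:

http://192.168.66.99:8080/solr/admin/cores?action=CREATE&name=test&collection=test&shard=shard1&instanceDir=/usr/local/data/solr/solr-1/test/&schema=schema2.xml

name: the name of the core. It is also the name of the folder under solrhome that holds this shard's data files.

collection: the name of the collection. If the collection does not exist, it is created; if it exists, the shard is checked next.

shard: the name of the shard. If the shard does not exist, it is created; if it exists, a replica of that shard is created.

This command creates a collection named test on http://192.168.66.99:8080, creates a shard named shard1 in it, and makes this machine the leader of that shard.

http://192.168.66.99:8080/solr/admin/cores?action=CREATE&name=test_shard1_replica_2&collection=test&shard=shard1

This command creates a replica of shard1 for the collection test on http://192.168.66.99:8080.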
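
The decision rules described above (create the collection if missing, create the shard if missing, otherwise add a replica) can be sketched as a toy in-memory model. This illustrates the behavior as described in this article and is not Solr's actual implementation; the function name plan_core_create and the dictionary layout are hypothetical.

```python
def plan_core_create(cluster, collection, shard):
    """Return the steps a core CREATE would take, per the rules described above.

    cluster is a toy model: {collection_name: {shard_name: number_of_cores}}.
    """
    steps = []
    if collection not in cluster:
        cluster[collection] = {}
        steps.append("create collection " + collection)
    shards = cluster[collection]
    if shard not in shards:
        shards[shard] = 1  # first core of this shard
        steps.append("create shard " + shard)
    else:
        shards[shard] += 1  # shard already exists, so this core is a replica
        steps.append("add replica of " + shard)
    return steps


cluster = {}
print(plan_core_create(cluster, "test", "shard1"))
print(plan_core_create(cluster, "test", "shard1"))
```

The first call corresponds to the first example command, the second call to the replica command.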

 

3. Reloading a core

Official example: http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0

4. Renaming a core

Official example: http://localhost:8983/solr/admin/cores?action=RENAME&core=core0&other=core5

5. Swapping cores

Official example: http://localhost:8983/solr/admin/cores?action=SWAP&core=core1&other=core0

6. Unloading a core

Official example: http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core0

Optional parameters:

                

  • deleteIndex: if true, will remove the index when unloading the core.
  • deleteDataDir: if true, removes the data directory and all sub-directories.
  • deleteInstanceDir: if true, removes everything related to the core, including the index directory, configuration files, and other related files.
  • async: if set to a value, makes the call asynchronous. This call can then be tracked using the REQUESTSTATUS API.
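
A short sketch of how these optional flags map onto the UNLOAD URL. The helper name unload_url and its keyword arguments are hypothetical; only flags that are switched on are added to the query string, and no request is sent.

```python
from urllib.parse import urlencode


def unload_url(base_url, core, delete_index=False, delete_data_dir=False,
               delete_instance_dir=False, async_id=None):
    """Build a Cores API UNLOAD URL with the optional parameters listed above."""
    params = {"action": "UNLOAD", "core": core}
    if delete_index:
        params["deleteIndex"] = "true"
    if delete_data_dir:
        params["deleteDataDir"] = "true"
    if delete_instance_dir:
        params["deleteInstanceDir"] = "true"
    if async_id is not None:
        params["async"] = async_id
    return base_url + "/admin/cores?" + urlencode(params)


print(unload_url("http://localhost:8983/solr", "core0", delete_index=True))
```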

7. Merging indexes

Official examples:

Method 1: http://localhost:8983/solr/admin/cores?action=MERGEINDEXES&core=core0&indexDir=/opt/solr/core1/data/index&indexDir=/opt/solr/core2/data/index

Method 2: http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&srcCore=core1&srcCore=core2
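
Both methods repeat a parameter (indexDir or srcCore) once per source. A minimal sketch of building such a query is to pass a list of key/value pairs to urlencode, which keeps the repeated keys. The helper name merge_indexes_url is hypothetical and no request is sent.

```python
from urllib.parse import urlencode


def merge_indexes_url(base_url, core, index_dirs=(), src_cores=()):
    """Build a MERGEINDEXES URL; indexDir and srcCore may each appear several times."""
    if not index_dirs and not src_cores:
        raise ValueError("at least one indexDir or srcCore is required")
    pairs = [("action", "MERGEINDEXES"), ("core", core)]
    pairs += [("indexDir", d) for d in index_dirs]
    pairs += [("srcCore", c) for c in src_cores]
    # safe="/" leaves the directory separators unescaped.
    return base_url + "/admin/cores?" + urlencode(pairs, safe="/")


base = "http://localhost:8983/solr"
print(merge_indexes_url(base, "core0", index_dirs=[
    "/opt/solr/core1/data/index", "/opt/solr/core2/data/index"]))
print(merge_indexes_url(base, "core0", src_cores=["core1", "core2"]))
```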

 

8. Splitting an index

Official example: http://localhost:8983/solr/admin/cores?action=SPLIT&core=core0&targetCore=core1&targetCore=core2

Optional parameters:

Parameter   Multi-valued  Description
core        false         The name of the core to be split.
path        true          The directory path in which a piece of the index will be written.
targetCore  true          The target Solr core to which a piece of the index will be merged.
ranges      false         A comma-separated list of hash ranges in hexadecimal format.
split.key   false         The key to be used for splitting the index.
async       false         (Optional) Request ID to track this action which will be processed asynchronously.
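
The ranges parameter expects hash ranges written in hexadecimal and separated by commas. As a small sketch under that description, the helper below formats integer (start, end) pairs into such a string. The name format_hash_ranges is hypothetical, and the sketch assumes non-negative bounds only; how negative hash values are written is not covered here.

```python
def format_hash_ranges(ranges):
    """Format (start, end) integer pairs as a comma-separated list of hex ranges."""
    parts = []
    for start, end in ranges:
        if start < 0 or end < start:
            # Assumption of this sketch: only non-negative, ordered bounds.
            raise ValueError("invalid range: %r-%r" % (start, end))
        parts.append(format(start, "x") + "-" + format(end, "x"))
    return ",".join(parts)


print(format_hash_ranges([(0, 500), (501, 1000), (1001, 1500)]))
# prints 0-1f4,1f5-3e8,3e9-5dc
```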
 

9. Checking request status

Official example: http://localhost:8983/solr/admin/cores?action=REQUESTSTATUS&requestid=1
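
The async parameter and REQUESTSTATUS work as a pair: the request ID passed as async is the one later queried as requestid. A minimal sketch that builds both URLs for one operation follows. The helper name async_request_urls is hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def async_request_urls(base_url, params, request_id):
    """Return (submit_url, status_url) for an asynchronous Cores API call."""
    submit = dict(params)
    submit["async"] = request_id  # "async" is a Python keyword, so set it by key
    status = {"action": "REQUESTSTATUS", "requestid": request_id}
    prefix = base_url + "/admin/cores?"
    return prefix + urlencode(submit), prefix + urlencode(status)


submit_url, status_url = async_request_urls(
    "http://localhost:8983/solr",
    {"action": "UNLOAD", "core": "core0"},
    "1",
)
print(submit_url)
print(status_url)
```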



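III. Collections in practice

The APIs above give us a set of methods for operating on collections and cores. Now let us consider the problems that can come up in real scenarios.

1. Scenario 1: adding a collection

After setting up SolrCloud, the first thing to consider is creating a collection and sharding it. There are two ways to do this.

(1) Let SolrCloud shard automatically and assign the shard names, by running:

     http://192.168.66.99:8080/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFactor=2&maxShardsPerNode=3

(2) Assign each shard to a machine ourselves, by running a command like the following on each node:

    http://192.168.66.99:7080/solr/admin/cores?action=CREATE&name=test_shard1_replica_1&collection=test&shard=shard1

    ...

Both methods allow specifying the configuration files and the storage path.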
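
For the second approach, the per-node commands can be generated from a desired layout. The sketch below spreads replicas over the nodes in round-robin order, which is only one possible placement and an assumption of this example. The function name plan_core_creates is hypothetical, and the two node URLs reuse ports that appear in this article.

```python
from urllib.parse import urlencode


def plan_core_creates(nodes, collection, num_shards, replication_factor):
    """Return one Cores API CREATE URL per replica, assigned round-robin to nodes."""
    urls = []
    slot = 0
    for shard_no in range(1, num_shards + 1):
        for replica_no in range(1, replication_factor + 1):
            node = nodes[slot % len(nodes)]
            slot += 1
            params = {
                "action": "CREATE",
                "name": f"{collection}_shard{shard_no}_replica_{replica_no}",
                "collection": collection,
                "shard": f"shard{shard_no}",
            }
            urls.append(node + "/admin/cores?" + urlencode(params))
    return urls


nodes = ["http://192.168.66.99:7080/solr", "http://192.168.66.99:8080/solr"]
for url in plan_core_creates(nodes, "test", num_shards=2, replication_factor=2):
    print(url)
```

With two nodes and two replicas per shard, this placement puts the two replicas of each shard on different nodes.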

 

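2. Scenario 2: scaling out

As data volume and traffic grow, SolrCloud needs to be scaled out to keep running. This can involve two cases.

(1) Adding a shard to a collection

Method 1: use action=SPLITSHARD to split a shard in two, then rename the results or perform other follow-up operations.

Method 2: create it directly with cores?action=CREATE&name=test&collection=test&shard=shard1, using a shard name that does not exist yet.

(2) Adding a replica of a shard

Likewise, create it directly with cores?action=CREATE&name=test&collection=test&shard=shard1, using the name of the existing shard.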

 

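3. Scenario 3: replacing a server

My suggestion: first add the new server to SolrCloud and let it sync the index files, then take the old server offline. This is safe and quick, and can be done directly through the admin UI.

As these scenarios show, the Cores API is often the quicker option in practice, so it is worth studying closely.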


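4. In addition, configuration mistakes can happen while setting up SolrCloud, and the SolrCloud admin UI reports them. For example, if a collection is configured with a schema.xml that does not exist in ZooKeeper,
SolrCloud reports:
test3_shard2_replica1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not load core configuration for core test3_shard2_replica1
To recover from this kind of error:
    (1) Delete the related folders under solrhome
    (2) Restart the SolrCloud nodes one by one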
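
Step (1) can be sketched as a small cleanup helper. This is a minimal sketch under the assumption that each broken core has a folder named after it directly under solrhome; the function name remove_core_dir is hypothetical. It refuses to delete anything outside the given solrhome, and restarting the nodes is still done separately.

```python
import shutil
from pathlib import Path


def remove_core_dir(solr_home, core_name):
    """Delete solr_home/core_name and return True if a folder was removed."""
    home = Path(solr_home).resolve()
    target = (home / core_name).resolve()
    # Safety check: the target must be strictly inside solr_home.
    if target == home or home not in target.parents:
        raise ValueError("refusing to delete outside solr home: %s" % target)
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False


# Example call (not executed here; the solrhome path is an assumption):
# remove_core_dir("/usr/local/solr/solrhome-1", "test3_shard2_replica1")
```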

