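Background:
Like MySQL, MongoDB can produce slow queries, and they call for the same kinds of optimization: creating indexes, restructuring queries, and so on. This post covers the essentials of indexes in MongoDB. For the typical characteristics of MongoDB slow queries, see the article "MongoDB 查詢優化分析" (MongoDB query optimization analysis).
Execution plan analysis:
MongoDB indexes are also B-tree indexes, so they are used in much the same way as in MySQL. Run explain on a query to view its execution plan and decide which index to add. explain was reworked in version 3.0, so its output is analyzed separately for the two generations:
Before 3.0: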
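zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain()
{
    "cursor" : "BtreeCursor b_1_date_1",   // cursor type: BasicCursor (full collection scan), BtreeCursor (B-tree index scan), GeoSearchCursor (geospatial index scan)
    "isMultiKey" : false,
    "n" : 324,                             // number of documents returned, same as count()
    "nscannedObjects" : 324,               // number of documents scanned
    "nscanned" : 324,                      // number of index entries scanned
    "nscannedObjectsAllPlans" : 324,       // documents scanned across all candidate plans that were tried
    "nscannedAllPlans" : 324,              // index entries scanned across all candidate plans that were tried
    "scanAndOrder" : false,                // true: results had to be sorted in memory; false: the index order was used
    "indexOnly" : false,                   // true: covered query, the filter and the returned fields are served from the index alone without fetching documents
    "nYields" : 0,                         // number of times the read lock was yielded so that writes could proceed
    "nChunkSkips" : 0,                     // number of documents skipped
    "millis" : 1,                          // query execution time in milliseconds
    "indexBounds" : {                      // lower and upper bounds used for the index scan
        "b" : [ [ "CYHS1301942", "CYHS1301942" ] ],
        "date" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ]
    },
    "server" : "db-mongo1:27017"
}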
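The fields above can be read mechanically. The following is a minimal sketch in plain JavaScript, assuming the pre-3.0 explain() document has already been obtained as an object; summarizeLegacyExplain is a hypothetical helper name, not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): condenses a pre-3.0 explain()
// document into the few facts that usually matter when judging an index.
function summarizeLegacyExplain(plan) {
  const cursor = plan.cursor || "";
  return {
    fullScan: cursor.startsWith("BasicCursor"),   // no index was used
    usesIndex: cursor.startsWith("BtreeCursor"),  // B-tree index scan
    // Index entries scanned per returned document; values far above 1
    // suggest the index is not selective enough for this query.
    scannedPerReturned: plan.n > 0 ? plan.nscanned / plan.n : null,
    inMemorySort: plan.scanAndOrder === true,
    covered: plan.indexOnly === true
  };
}

// Values copied from the explain() output shown above.
const legacySummary = summarizeLegacyExplain({
  cursor: "BtreeCursor b_1_date_1",
  n: 324,
  nscanned: 324,
  nscannedObjects: 324,
  scanAndOrder: false,
  indexOnly: false
});
console.log(legacySummary);
```

In the mongo shell the same function could be applied directly to the object returned by explain().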
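From 3.0 on: explain() accepts one of three verbosity modes: "queryPlanner", "executionStats", and "allPlansExecution". The default is "queryPlanner". See the official documentation for the exact meaning of each.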
zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain()
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "cde.newtask",       // namespace (database.collection)
        "indexFilterSet" : false,
        "parsedQuery" : { "b" : { "$eq" : "CYHS1301942" } },
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN",        // index scan; COLLSCAN would indicate a full collection scan
                "keyPattern" : { "b" : 1, "date" : 1 },
                "indexName" : "b_1_date_1",   // index name
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [ "[\"CYHS1301942\", \"CYHS1301942\"]" ],
                    "date" : [ "[MinKey, MaxKey]" ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "mongo1",
        "port" : 27017,
        "version" : "3.0.4",
        "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5"
    },
    "ok" : 1
}
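Because the 3.0 plan is a tree of stages linked through inputStage, checking for a COLLSCAN means walking that tree. A minimal sketch in plain JavaScript follows; collectStages is a hypothetical helper, and the plan object is trimmed from the output above.

```javascript
// Hypothetical helper (not a MongoDB API): lists the stage names of a
// 3.0-style plan from the outermost stage down to the innermost one.
function collectStages(stage) {
  if (!stage) return [];
  const children = [];
  if (stage.inputStage) children.push(stage.inputStage);
  // Some stages have several children, reported under "inputStages".
  if (Array.isArray(stage.inputStages)) children.push(...stage.inputStages);
  return [stage.stage].concat(...children.map(collectStages));
}

// winningPlan trimmed from the explain() output shown above.
const winningPlan = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN", indexName: "b_1_date_1" }
};

const planStages = collectStages(winningPlan);
const needsIndex = planStages.includes("COLLSCAN");
console.log(planStages, needsIndex);
```

A plan whose stage list contains COLLSCAN is a candidate for a new index.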
In 3.0, pass one of the other two modes to see a more detailed execution plan:

zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain("allPlansExecution")
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "cde.newtask",
        "indexFilterSet" : false,
        "parsedQuery" : { "b" : { "$eq" : "CYHS1301942" } },
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : { "b" : 1, "date" : 1 },
                "indexName" : "b_1_date_1",
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [ "[\"CYHS1301942\", \"CYHS1301942\"]" ],
                    "date" : [ "[MinKey, MaxKey]" ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "executionStats" : {
        "executionSuccess" : true,
        "nReturned" : 1,
        "executionTimeMillis" : 0,
        "totalKeysExamined" : 1,
        "totalDocsExamined" : 1,
        "executionStages" : {
            "stage" : "FETCH",
            "nReturned" : 1,
            "executionTimeMillisEstimate" : 0,
            "works" : 2,
            "advanced" : 1,
            "needTime" : 0,
            "needFetch" : 0,
            "saveState" : 0,
            "restoreState" : 0,
            "isEOF" : 1,
            "invalidates" : 0,
            "docsExamined" : 1,
            "alreadyHasObj" : 0,
            "inputStage" : {
                "stage" : "IXSCAN",
                "nReturned" : 1,
                "executionTimeMillisEstimate" : 0,
                "works" : 2,
                "advanced" : 1,
                "needTime" : 0,
                "needFetch" : 0,
                "saveState" : 0,
                "restoreState" : 0,
                "isEOF" : 1,
                "invalidates" : 0,
                "keyPattern" : { "b" : 1, "date" : 1 },
                "indexName" : "b_1_date_1",
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [ "[\"CYHS1301942\", \"CYHS1301942\"]" ],
                    "date" : [ "[MinKey, MaxKey]" ]
                },
                "keysExamined" : 1,
                "dupsTested" : 0,
                "dupsDropped" : 0,
                "seenInvalidated" : 0,
                "matchTested" : 0
            }
        },
        "allPlansExecution" : [ ]
    },
    "serverInfo" : {
        "host" : "mongo1",
        "port" : 27017,
        "version" : "3.0.4",
        "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5"
    },
    "ok" : 1
}

zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain("executionStats")
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "cde.newtask",
        "indexFilterSet" : false,
        "parsedQuery" : { "b" : { "$eq" : "CYHS1301942" } },
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : { "b" : 1, "date" : 1 },
                "indexName" : "b_1_date_1",
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [ "[\"CYHS1301942\", \"CYHS1301942\"]" ],
                    "date" : [ "[MinKey, MaxKey]" ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "executionStats" : {
        "executionSuccess" : true,
        "nReturned" : 1,
        "executionTimeMillis" : 0,
        "totalKeysExamined" : 1,
        "totalDocsExamined" : 1,
        "executionStages" : {
            "stage" : "FETCH",
            "nReturned" : 1,
            "executionTimeMillisEstimate" : 0,
            "works" : 2,
            "advanced" : 1,
            "needTime" : 0,
            "needFetch" : 0,
            "saveState" : 0,
            "restoreState" : 0,
            "isEOF" : 1,
            "invalidates" : 0,
            "docsExamined" : 1,
            "alreadyHasObj" : 0,
            "inputStage" : {
                "stage" : "IXSCAN",
                "nReturned" : 1,
                "executionTimeMillisEstimate" : 0,
                "works" : 2,
                "advanced" : 1,
                "needTime" : 0,
                "needFetch" : 0,
                "saveState" : 0,
                "restoreState" : 0,
                "isEOF" : 1,
                "invalidates" : 0,
                "keyPattern" : { "b" : 1, "date" : 1 },
                "indexName" : "b_1_date_1",
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [ "[\"CYHS1301942\", \"CYHS1301942\"]" ],
                    "date" : [ "[MinKey, MaxKey]" ]
                },
                "keysExamined" : 1,
                "dupsTested" : 0,
                "dupsDropped" : 0,
                "seenInvalidated" : 0,
                "matchTested" : 0
            }
        }
    },
    "serverInfo" : {
        "host" : "mongo1",
        "port" : 27017,
        "version" : "3.0.4",
        "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5"
    },
    "ok" : 1
}
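A common way to read executionStats is to compare how much was examined with how much was returned. The sketch below does this in plain JavaScript; scanEfficiency is a hypothetical helper, and using these ratios as a rule of thumb is a suggestion here rather than something the output itself prescribes.

```javascript
// Hypothetical helper (not a MongoDB API): relates the work done by a query
// to the number of documents it returned, using 3.0 executionStats fields.
function scanEfficiency(stats) {
  const returned = stats.nReturned;
  return {
    // Index keys examined per returned document.
    keysPerReturned: returned > 0 ? stats.totalKeysExamined / returned : null,
    // Documents fetched per returned document.
    docsPerReturned: returned > 0 ? stats.totalDocsExamined / returned : null
  };
}

// Values copied from the executionStats output shown above.
const efficiency = scanEfficiency({
  nReturned: 1,
  totalKeysExamined: 1,
  totalDocsExamined: 1
});
console.log(efficiency);
```

Ratios close to 1, as in this example, mean the index matches the query well; much larger values mean many keys or documents were examined only to be discarded.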
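Now that viewing execution plans has been covered, the next part deals with managing indexes.
Index management (for details see [Chapter 5 of the Definitive Guide]):
1) List the indexes of a collection: db.collectionName.getIndexes() or db.system.indexes.find() (direct access to db.system.indexes is deprecated as of 3.0, so prefer getIndexes())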
zjy:PRIMARY> db.data.getIndexes()
[
    {
        "v" : 1,
        "key" : { "_id" : 1 },
        "name" : "_id_",          // index name
        "ns" : "survey.data"      // namespace (database.collection)
    },
    {
        "v" : 1,
        "unique" : true,          // unique index
        "key" : { "sid" : 1, "user" : 1 },
        "name" : "sid_1_user_1",
        "ns" : "survey.data"
    },
    {
        "v" : 1,
        "key" : { "sid" : 1, "cdate" : -1 },
        "name" : "sid_1_cdate_-1",
        "ns" : "survey.data"
    },
    {
        "v" : 1,
        "key" : { "sid" : 1, "created" : -1 },
        "name" : "sid_1_created_-1",
        "ns" : "survey.data"
    },
    {
        "v" : 1,
        "key" : { "sid" : 1, "user" : 1, "modified" : 1 },
        "name" : "sid_1_user_1_modified_1",
        "ns" : "survey.data"
    }
]
zjy:PRIMARY> db.system.indexes.find({"ns":"survey.data"})
{ "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "survey.data" }
{ "v" : 1, "unique" : true, "key" : { "sid" : 1, "user" : 1 }, "name" : "sid_1_user_1", "ns" : "survey.data" }
{ "v" : 1, "key" : { "sid" : 1, "cdate" : -1 }, "name" : "sid_1_cdate_-1", "ns" : "survey.data" }
{ "v" : 1, "key" : { "sid" : 1, "created" : -1 }, "name" : "sid_1_created_-1", "ns" : "survey.data" }
{ "v" : 1, "key" : { "sid" : 1, "user" : 1, "modified" : 1 }, "name" : "sid_1_user_1_modified_1", "ns" : "survey.data" }
2) Create an index: db.collectionName.ensureIndex({...}) (as of 3.0, ensureIndex is a deprecated alias for createIndex)
Regular indexes
zjy:PRIMARY> db.comments.ensureIndex({"name":1})   // index on the name field, ascending; use -1 for descending
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 2, "numIndexesAfter" : 3, "ok" : 1 }
zjy:PRIMARY> db.comments.ensureIndex({"account.name":1})   // index on a field of an embedded document
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 3, "numIndexesAfter" : 4, "ok" : 1 }
zjy:PRIMARY> db.comments.ensureIndex({"age":1},{"name":"idx_name"})   // explicit index name
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 4, "numIndexesAfter" : 5, "ok" : 1 }
zjy:PRIMARY> db.comments.ensureIndex({"name":1,"age":1},{"name":"idx_name_age","background":true})   // compound index, built in the background
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 5, "numIndexesAfter" : 6, "ok" : 1 }
zjy:PRIMARY> db.comments.ensureIndex({"name":1,"age":1},{"name":"uk_name_age","background":true,"unique":true})   // unique index, built in the background
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
zjy:PRIMARY> db.comments.ensureIndex({"name":1,"age":1},{"unique":true,"dropDups":true,"name":"uk_name_age"})   // unique index that deletes duplicate documents while building; dropDups is no longer supported as of 3.0
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
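When no name option is given, the index names seen in the getIndexes() output (for example sid_1_cdate_-1) follow directly from the key pattern. The sketch below reproduces that naming in plain JavaScript; defaultIndexName is a hypothetical helper written for illustration, not a shell function.

```javascript
// Hypothetical helper (not a MongoDB API): derives the default index name
// by joining each field and its direction or type with underscores.
function defaultIndexName(keyPattern) {
  return Object.keys(keyPattern)
    .map(function (field) { return field + "_" + keyPattern[field]; })
    .join("_");
}

// Key patterns taken from the examples in this article.
console.log(defaultIndexName({ sid: 1, cdate: -1 }));
console.log(defaultIndexName({ sid: 1, user: 1, modified: 1 }));
console.log(defaultIndexName({ a: "hashed" }));
```

Knowing the generated name is useful later on, because dropIndex and hint both accept an index name as well as a key pattern.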
Hashed index: hashed
zjy:PRIMARY> db.abc.ensureIndex({"a":"hashed"})
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
zjy:PRIMARY> db.abc.getIndexes()
[
    { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.abc" },
    { "v" : 1, "key" : { "a" : "hashed" }, "name" : "a_hashed", "ns" : "test.abc" }
]
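Two more special index types are worth mentioning: sparse indexes (sparse), which only contain entries for documents that have the indexed field, and TTL indexes (expireAfterSeconds).
A TTL index is built on a date field and gives documents a limited lifetime: once the indexed time lies more than expireAfterSeconds seconds in the past, the document is considered expired and is removed automatically.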
zjy:PRIMARY> db.comments.find()
{ "_id" : ObjectId("55ae6b99313fd7b879b5296c"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:09.651Z") }
{ "_id" : ObjectId("55ae6b9a313fd7b879b5296d"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:10.739Z") }
{ "_id" : ObjectId("55ae6b9b313fd7b879b5296e"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:11.555Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b5296f"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:12.267Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b52970"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:12.899Z") }
zjy:PRIMARY> db.comments.ensureIndex({"ts":1},{expireAfterSeconds:60})   // create a TTL index with a 60-second lifetime: documents are deleted once their ts is more than 60 seconds old
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
zjy:PRIMARY> db.comments.find()
{ "_id" : ObjectId("55ae6b99313fd7b879b5296c"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:09.651Z") }
{ "_id" : ObjectId("55ae6b9a313fd7b879b5296d"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:10.739Z") }
{ "_id" : ObjectId("55ae6b9b313fd7b879b5296e"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:11.555Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b5296f"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:12.267Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b52970"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:12.899Z") }
zjy:PRIMARY> db.comments.getIndexes()
[
    { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.comments" },
    { "v" : 1, "key" : { "ts" : 1 }, "name" : "ts_1", "ns" : "test.comments", "expireAfterSeconds" : 60 }
]
zjy:PRIMARY> db.comments.find()   // checked again after 60 seconds: the documents are gone
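The expiry rule itself is simple arithmetic on the indexed date. The sketch below models it in plain JavaScript under simplifying assumptions: isExpired is a hypothetical helper, only single Date values are handled (arrays of dates are ignored), and the server's actual deletion is done by a background task that runs about once a minute, so an expired document can remain visible for a short while.

```javascript
// Hypothetical helper (not a MongoDB API): decides whether a document would
// be eligible for removal by a TTL index on `field` at the time `now`.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  // Documents without the field, or with a non-date value, never expire.
  if (!(value instanceof Date)) return false;
  return now.getTime() - value.getTime() > expireAfterSeconds * 1000;
}

// First document from the session above, with a 60-second lifetime.
const comment = { name: "zhoujy", age: 22, ts: new Date("2015-07-21T15:56:09.651Z") };

const after20s = isExpired(comment, "ts", 60, new Date("2015-07-21T15:56:30.000Z"));
const after80s = isExpired(comment, "ts", 60, new Date("2015-07-21T15:57:30.000Z"));
console.log(after20s, after80s);
```

This also explains the session above: the second find() still returned all five documents because it ran before they had aged past 60 seconds and before the background task had removed them.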
The last index type is the text index. More information is in [MongoDB大數據處理權威指南, Chapter 8] and here.
Test data:

db.comments.insert({"name":"abc","mem":"You can create a text index on the field or fields whose value is a string or an array of string elements","ts":new Date()})
db.comments.insert({"name":"def","mem":"When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**)","ts":new Date()})
db.comments.insert({"name":"ghi","mem":"This text index catalogs all string data in the subject field and the content field, where the field value is either a string or an array of string elements.","ts":new Date()})
db.comments.insert({"name":"jkl","mem":"To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content.","ts":new Date()})
db.comments.insert({"name":"mno","mem":"The following example indexes any string value in the data of every field of every document in collection and names the index TextIndex:","ts":new Date()})
Create:
> db.comments.ensureIndex({"mem":"text"})   // create a text index
{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }
Use: the $text operator
> db.comments.find({$text:{$search:"specifier"}}).pretty()
{
    "_id" : ObjectId("55aee886a782f35b366926ef"),
    "name" : "jkl",
    "mem" : "To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content.",
    "ts" : ISODate("2015-07-22T00:49:10.350Z")
}
{
    "_id" : ObjectId("55aee886a782f35b366926ed"),
    "name" : "def",
    "mem" : "When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**)",
    "ts" : ISODate("2015-07-22T00:49:10.346Z")
}
> db.comments.runCommand("text",{search:"specifier"})   // works before 3.0; no longer available from 3.0 on
{
    "results" : [
        {
            "score" : 0.8653846153846153,
            "obj" : {
                "_id" : ObjectId("55aee886a782f35b366926ed"),
                "name" : "def",
                "mem" : "When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**)",
                "ts" : ISODate("2015-07-22T00:49:10.346Z")
            }
        },
        {
            "score" : 0.5357142857142857,
            "obj" : {
                "_id" : ObjectId("55aee886a782f35b366926ef"),
                "name" : "jkl",
                "mem" : "To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content.",
                "ts" : ISODate("2015-07-22T00:49:10.350Z")
            }
        }
    ],
    "stats" : {
        "nscanned" : NumberLong(2),
        "nscannedObjects" : NumberLong(2),
        "n" : 2,
        "timeMicros" : 173
    },
    "ok" : 1
}
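The above is a brief introduction to the index types and how to use them. Details and caveats are in the official documentation; pay particular attention when using text and TTL indexes.
3) Drop an index: dropIndex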
zjy:PRIMARY> db.abc.getIndexes()   // list the indexes
[
    { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.abc" },
    {
        "v" : 1,
        "key" : { "a" : "hashed" },   // index key pattern
        "name" : "a_hashed",          // index name
        "ns" : "test.abc"
    },
    { "v" : 1, "key" : { "b" : 1 }, "name" : "b_1", "ns" : "test.abc" },
    { "v" : 1, "key" : { "c" : 1 }, "name" : "idx_c", "ns" : "test.abc" }
]
zjy:PRIMARY> db.abc.dropIndex({"a" : "hashed"})   // drop an index by its "key"
{ "nIndexesWas" : 4, "ok" : 1 }
zjy:PRIMARY> db.abc.dropIndex({"b" : 1})   // drop an index by its "key"
{ "nIndexesWas" : 3, "ok" : 1 }
zjy:PRIMARY> db.abc.dropIndex("idx_c")   // drop an index by its "name"
{ "nIndexesWas" : 2, "ok" : 1 }
zjy:PRIMARY> db.abc.getIndexes()
[
    { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.abc" }
]
zjy:PRIMARY> db.abc.dropIndex("*")   // drop all indexes of the collection except the one on _id
{ "nIndexesWas" : 4, "msg" : "non-_id indexes dropped for collection", "ok" : 1 }
4) Rebuild indexes: a damaged index needs to be rebuilt. reIndex
zjy:PRIMARY> db.abc.reIndex()   // run the rebuild
{
    "nIndexesWas" : 1,
    "nIndexes" : 1,
    "indexes" : [
        { "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.abc" }
    ],
    "ok" : 1
}
5) Force a query to use a specific index: hint
db.abc.find({"c":1,"b":2}).hint("b_1")   // hint takes either the index key pattern or the index name
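Summary:
Indexes speed up lookups, sorting, and similar operations, but they add overhead to inserts, updates, and deletes. Do not add indexes indiscriminately; put suitable indexes only on the fields that need them. For more information, see the official documentation.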