MongoDB Index Essentials

Reposted article. 2015-07-22 10:21. Tags: explain / NoSQL / notes / index / ensureIndex

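Background:

MongoDB, like MySQL, can produce slow queries, so both need tuning: creating indexes, restructuring queries, and so on. This article covers the index-related side of MongoDB. For the characteristics of MongoDB slow queries, see the article "MongoDB 查詢優化分析" (MongoDB query optimization analysis).

Execution plan analysis:

MongoDB also uses B-tree indexes, so working with them is much the same as in MySQL. Use explain to view a query's execution plan and decide which index to add. explain was reworked in version 3.0, so both forms are analyzed here:

Before 3.0: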

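zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain()
{
    "cursor" : "BtreeCursor b_1_date_1", #cursor type: BasicCursor (full collection scan), BtreeCursor (B-tree index scan), GeoSearchCursor (geospatial index scan)
    "isMultiKey" : false,
    "n" : 324,  #number of documents the query matched
    "nscannedObjects" : 324, #number of documents scanned
    "nscanned" : 324,        #number of index entries scanned
    "nscannedObjectsAllPlans" : 324, #documents scanned across all candidate plans that were tried
    "nscannedAllPlans" : 324,        #index entries scanned across all candidate plans that were tried
    "scanAndOrder" : false,          #true: results had to be sorted in memory after retrieval; false: the index order was used, so no separate sort was needed
    "indexOnly" : false,             #true when the query is covered: the filter and the returned fields are all in the index, so no documents are fetched
    "nYields" : 0,                   #number of times the read lock was yielded so that writes could proceed
    "nChunkSkips" : 0,               #number of documents skipped (because of chunk migrations in a sharded cluster)
    "millis" : 1,                    #query execution time in milliseconds
    "indexBounds" : {   #lower and upper bounds of the index keys scanned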
        "b" : [
            [
                "CYHS1301942",
                "CYHS1301942"
            ]
        ],
        "date" : [
            [
                {
                    "$minElement" : 1
                },
                {
                    "$maxElement" : 1
                }
            ]
        ]
    },
    "server" : "db-mongo1:27017"
}
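
The fields above are easier to judge when reduced to a few signals. The following is a minimal sketch, not part of MongoDB: summarizeLegacyExplain is a hypothetical helper name, and it assumes the cursor string has exactly the form "BtreeCursor <index name>" seen in the sample output.

```javascript
// Hypothetical helper (not a MongoDB API): condense a pre-3.0 explain() document.
function summarizeLegacyExplain(plan) {
    // "BasicCursor" is a full collection scan; "BtreeCursor <name>" is an index scan.
    var prefix = "BtreeCursor ";
    var usesIndex = plan.cursor.indexOf(prefix) === 0;
    return {
        usesIndex: usesIndex,
        indexName: usesIndex ? plan.cursor.slice(prefix.length) : null,
        // Index entries scanned per matched document; a value near 1 means a selective index.
        scannedPerReturned: plan.n > 0 ? plan.nscanned / plan.n : null,
        // true means the index order could not be used and results were sorted in memory.
        inMemorySort: plan.scanAndOrder === true,
        // true means the index alone answered the query (covered query).
        coveredByIndex: plan.indexOnly === true
    };
}

// Values copied from the explain() output above.
var summary = summarizeLegacyExplain({
    cursor: "BtreeCursor b_1_date_1",
    n: 324,
    nscanned: 324,
    scanAndOrder: false,
    indexOnly: false
});
```

In the mongo shell the argument would be the document returned by explain() itself; the literal here only keeps the sketch runnable without a server.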

From 3.0 on: explain() accepts one of three verbosity modes: "queryPlanner", "executionStats", and "allPlansExecution". The default is queryPlanner. See the official documentation for what each one means.

zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain()
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "cde.newtask",    #namespace (database.collection)
        "indexFilterSet" : false,
        "parsedQuery" : {
            "b" : {
                "$eq" : "CYHS1301942"
            }
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN",     #index scan; COLLSCAN would mean a full collection scan
                "keyPattern" : {
                    "b" : 1,
                    "date" : 1
                },
                "indexName" : "b_1_date_1", #index name
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [
                        "[\"CYHS1301942\", \"CYHS1301942\"]"
                    ],
                    "date" : [
                        "[MinKey, MaxKey]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "mongo1",
        "port" : 27017,
        "version" : "3.0.4",
        "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5"
    },
    "ok" : 1
}
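
Because the 3.0 plan is a tree of nested stages, checking for a full collection scan means walking that tree. A minimal sketch, assuming the queryPlanner layout shown above; collectStages and hasCollectionScan are hypothetical helper names, not MongoDB functions:

```javascript
// Hypothetical helper: list the stage names of a 3.0-style plan, outermost first.
function collectStages(stage) {
    var names = [];
    function visit(node) {
        if (!node) { return; }
        names.push(node.stage);
        // Most stages have a single child under "inputStage".
        if (node.inputStage) { visit(node.inputStage); }
        // Some stages (for example an OR) list several children under "inputStages".
        (node.inputStages || []).forEach(function (child) { visit(child); });
    }
    visit(stage);
    return names;
}

// True when the winning plan contains a COLLSCAN stage anywhere in the tree.
function hasCollectionScan(explainDoc) {
    var stages = collectStages(explainDoc.queryPlanner.winningPlan);
    return stages.indexOf("COLLSCAN") !== -1;
}

// Trimmed copy of the winningPlan shown above.
var sample = {
    queryPlanner: {
        winningPlan: {
            stage: "FETCH",
            inputStage: { stage: "IXSCAN", indexName: "b_1_date_1" }
        }
    }
};
var stages = collectStages(sample.queryPlanner.winningPlan);
```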

In 3.0, for a more detailed execution plan, use the other two modes:

zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain("allPlansExecution")
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "cde.newtask",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "b" : {
                "$eq" : "CYHS1301942"
            }
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "b" : 1,
                    "date" : 1
                },
                "indexName" : "b_1_date_1",
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [
                        "[\"CYHS1301942\", \"CYHS1301942\"]"
                    ],
                    "date" : [
                        "[MinKey, MaxKey]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "executionStats" : {
        "executionSuccess" : true,
        "nReturned" : 1,
        "executionTimeMillis" : 0,
        "totalKeysExamined" : 1,
        "totalDocsExamined" : 1,
        "executionStages" : {
            "stage" : "FETCH",
            "nReturned" : 1,
            "executionTimeMillisEstimate" : 0,
            "works" : 2,
            "advanced" : 1,
            "needTime" : 0,
            "needFetch" : 0,
            "saveState" : 0,
            "restoreState" : 0,
            "isEOF" : 1,
            "invalidates" : 0,
            "docsExamined" : 1,
            "alreadyHasObj" : 0,
            "inputStage" : {
                "stage" : "IXSCAN",
                "nReturned" : 1,
                "executionTimeMillisEstimate" : 0,
                "works" : 2,
                "advanced" : 1,
                "needTime" : 0,
                "needFetch" : 0,
                "saveState" : 0,
                "restoreState" : 0,
                "isEOF" : 1,
                "invalidates" : 0,
                "keyPattern" : {
                    "b" : 1,
                    "date" : 1
                },
                "indexName" : "b_1_date_1",
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [
                        "[\"CYHS1301942\", \"CYHS1301942\"]"
                    ],
                    "date" : [
                        "[MinKey, MaxKey]"
                    ]
                },
                "keysExamined" : 1,
                "dupsTested" : 0,
                "dupsDropped" : 0,
                "seenInvalidated" : 0,
                "matchTested" : 0
            }
        },
        "allPlansExecution" : [ ]
    },
    "serverInfo" : {
        "host" : "mongo1",
        "port" : 27017,
        "version" : "3.0.4",
        "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5"
    },
    "ok" : 1
}

zjy:PRIMARY> db.newtask.find({"b":"CYHS1301942"}).explain("executionStats")
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "cde.newtask",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "b" : {
                "$eq" : "CYHS1301942"
            }
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "b" : 1,
                    "date" : 1
                },
                "indexName" : "b_1_date_1",
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [
                        "[\"CYHS1301942\", \"CYHS1301942\"]"
                    ],
                    "date" : [
                        "[MinKey, MaxKey]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "executionStats" : {
        "executionSuccess" : true,
        "nReturned" : 1,
        "executionTimeMillis" : 0,
        "totalKeysExamined" : 1,
        "totalDocsExamined" : 1,
        "executionStages" : {
            "stage" : "FETCH",
            "nReturned" : 1,
            "executionTimeMillisEstimate" : 0,
            "works" : 2,
            "advanced" : 1,
            "needTime" : 0,
            "needFetch" : 0,
            "saveState" : 0,
            "restoreState" : 0,
            "isEOF" : 1,
            "invalidates" : 0,
            "docsExamined" : 1,
            "alreadyHasObj" : 0,
            "inputStage" : {
                "stage" : "IXSCAN",
                "nReturned" : 1,
                "executionTimeMillisEstimate" : 0,
                "works" : 2,
                "advanced" : 1,
                "needTime" : 0,
                "needFetch" : 0,
                "saveState" : 0,
                "restoreState" : 0,
                "isEOF" : 1,
                "invalidates" : 0,
                "keyPattern" : {
                    "b" : 1,
                    "date" : 1
                },
                "indexName" : "b_1_date_1",
                "isMultiKey" : false,
                "direction" : "forward",
                "indexBounds" : {
                    "b" : [
                        "[\"CYHS1301942\", \"CYHS1301942\"]"
                    ],
                    "date" : [
                        "[MinKey, MaxKey]"
                    ]
                },
                "keysExamined" : 1,
                "dupsTested" : 0,
                "dupsDropped" : 0,
                "seenInvalidated" : 0,
                "matchTested" : 0
            }
        }
    },
    "serverInfo" : {
        "host" : "mongo1",
        "port" : 27017,
        "version" : "3.0.4",
        "gitVersion" : "0481c958daeb2969800511e7475dc66986fa9ed5"
    },
    "ok" : 1
}
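
In executionStats output, the three numbers worth comparing first are nReturned, totalKeysExamined, and totalDocsExamined. The sketch below turns them into ratios; scanEfficiency is a hypothetical helper name, and the interpretation in the comments is a rule of thumb rather than an official metric.

```javascript
// Hypothetical helper: compare the work done against the documents returned.
function scanEfficiency(explainDoc) {
    var stats = explainDoc.executionStats;
    var returned = stats.nReturned;
    return {
        // Index keys examined per returned document; near 1 means the index is selective.
        keysPerReturned: returned > 0 ? stats.totalKeysExamined / returned : null,
        // Documents fetched per returned document; well above 1 means many fetched documents were filtered out.
        docsPerReturned: returned > 0 ? stats.totalDocsExamined / returned : null,
        // Results returned without fetching any document: the index alone answered the query.
        covered: returned > 0 && stats.totalDocsExamined === 0
    };
}

// Values copied from the executionStats output above.
var efficiency = scanEfficiency({
    executionStats: { nReturned: 1, totalKeysExamined: 1, totalDocsExamined: 1 }
});
```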

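The above showed how to view an execution plan; the following covers how to manage indexes.

Index management (for details, see [權威指南 (The Definitive Guide), Chapter 5])

1) List a collection's indexes: db.collectionName.getIndexes() or db.system.indexes.find() (direct access to system.indexes is deprecated as of 3.0)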

zjy:PRIMARY> db.data.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",       #index name
        "ns" : "survey.data"   #namespace (database.collection)
    },
    {
        "v" : 1,
        "unique" : true,       #unique index
        "key" : {
            "sid" : 1,
            "user" : 1
        },
        "name" : "sid_1_user_1",
        "ns" : "survey.data"
    },
    {
        "v" : 1,
        "key" : {
            "sid" : 1,
            "cdate" : -1
        },
        "name" : "sid_1_cdate_-1",
        "ns" : "survey.data"
    },
    {
        "v" : 1,
        "key" : {
            "sid" : 1,
            "created" : -1
        },
        "name" : "sid_1_created_-1",
        "ns" : "survey.data"
    },
    {
        "v" : 1,
        "key" : {
            "sid" : 1,
            "user" : 1,
            "modified" : 1
        },
        "name" : "sid_1_user_1_modified_1",
        "ns" : "survey.data"
    }
]

zjy:PRIMARY> db.system.indexes.find({"ns":"survey.data"})
{ "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "survey.data" }
{ "v" : 1, "unique" : true, "key" : { "sid" : 1, "user" : 1 }, "name" : "sid_1_user_1", "ns" : "survey.data" }
{ "v" : 1, "key" : { "sid" : 1, "cdate" : -1 }, "name" : "sid_1_cdate_-1", "ns" : "survey.data" }
{ "v" : 1, "key" : { "sid" : 1, "created" : -1 }, "name" : "sid_1_created_-1", "ns" : "survey.data" }
{ "v" : 1, "key" : { "sid" : 1, "user" : 1, "modified" : 1 }, "name" : "sid_1_user_1_modified_1", "ns" : "survey.data" }
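
A listing like this is also a convenient place to look for indexes whose key pattern is a leading prefix of another index, since a compound index can usually serve queries on its prefix. The following is a rough sketch under stated assumptions: isKeyPrefix and findPrefixIndexes are hypothetical helpers, field names are assumed not to look like integers (so Object.keys keeps their order), and unique indexes and _id_ are skipped because they cannot simply be dropped. Sparse or TTL indexes would need the same care.

```javascript
// Hypothetical helper: true when key pattern a is a strict leading prefix of key pattern b.
function isKeyPrefix(a, b) {
    var aKeys = Object.keys(a);
    var bKeys = Object.keys(b);
    if (aKeys.length >= bKeys.length) { return false; }
    for (var i = 0; i < aKeys.length; i++) {
        // Both the field name and the direction must match at the same position.
        if (aKeys[i] !== bKeys[i] || a[aKeys[i]] !== b[bKeys[i]]) { return false; }
    }
    return true;
}

// Hypothetical helper: report indexes that a longer index might make redundant.
function findPrefixIndexes(indexes) {
    var found = [];
    indexes.forEach(function (candidate) {
        // A unique index enforces a constraint and _id_ is mandatory, so skip both.
        if (candidate.unique || candidate.name === "_id_") { return; }
        indexes.forEach(function (other) {
            if (isKeyPrefix(candidate.key, other.key)) {
                found.push({ prefix: candidate.name, coveredBy: other.name });
            }
        });
    });
    return found;
}

// The index list of survey.data shown above.
var surveyIndexes = [
    { name: "_id_", key: { _id: 1 } },
    { name: "sid_1_user_1", key: { sid: 1, user: 1 }, unique: true },
    { name: "sid_1_cdate_-1", key: { sid: 1, cdate: -1 } },
    { name: "sid_1_created_-1", key: { sid: 1, created: -1 } },
    { name: "sid_1_user_1_modified_1", key: { sid: 1, user: 1, modified: 1 } }
];
// Empty for this collection: the only prefix, sid_1_user_1, is unique and therefore kept.
var redundant = findPrefixIndexes(surveyIndexes);
```

In the shell, the input would simply be db.data.getIndexes(). The result is only a list of candidates to review, not a list of indexes that are safe to drop.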

2) Create an index: db.collectionName.ensureIndex({...}) (as of 3.0, ensureIndex is a deprecated alias for createIndex)

Regular indexes

zjy:PRIMARY> db.comments.ensureIndex({"name":1})  #create an ascending index on the name field; use -1 for descending
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 2,
    "numIndexesAfter" : 3,
    "ok" : 1
}

zjy:PRIMARY> db.comments.ensureIndex({"account.name":1}) #create an index on a field of an embedded document
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 3,
    "numIndexesAfter" : 4,
    "ok" : 1
}

zjy:PRIMARY> db.comments.ensureIndex({"age":1},{"name":"idx_name"}) #specify the index name
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 4,
    "numIndexesAfter" : 5,
    "ok" : 1
}

zjy:PRIMARY> db.comments.ensureIndex({"name":1,"age":1},{"name":"idx_name_age","background":true}) #build a compound index in the background
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 5,
    "numIndexesAfter" : 6,
    "ok" : 1
}

zjy:PRIMARY> db.comments.ensureIndex({"name":1,"age":1},{"name":"uk_name_age","background":true,"unique":true}) #build a unique index in the background
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
zjy:PRIMARY> db.comments.ensureIndex({"name":1,"age":1},{"unique":true,"dropDups":true,"name":"uk_name_age"})   #create a unique index, deleting duplicate documents; dropDups is no longer supported as of 3.0
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
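
When no name option is given, the index names seen in this article (b_1_date_1, sid_1_cdate_-1, a_hashed, ts_1) are each field name joined to its direction or type with underscores. A small sketch of that convention; defaultIndexName is a hypothetical helper written for illustration, not a shell function, and it only reproduces the pattern visible in the outputs here.

```javascript
// Hypothetical helper: derive the default-style index name from a key pattern.
function defaultIndexName(keyPattern) {
    return Object.keys(keyPattern).map(function (field) {
        // Each field contributes "<field>_<direction or type>", e.g. "sid_1" or "a_hashed".
        return field + "_" + keyPattern[field];
    }).join("_");
}

var compoundName = defaultIndexName({ sid: 1, cdate: -1 });
var hashedName = defaultIndexName({ a: "hashed" });
```

Knowing the generated name helps with the operations that accept an index name, such as dropIndex("...") and hint("...").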

Hashed index: hashed

zjy:PRIMARY> db.abc.ensureIndex({"a":"hashed"})
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
zjy:PRIMARY> db.abc.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test.abc"
    },
    {
        "v" : 1,
        "key" : {
            "a" : "hashed"
        },
        "name" : "a_hashed",
        "ns" : "test.abc"
    }
]

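There are two more special kinds of index: sparse indexes (sparse) and TTL indexes (expireAfterSeconds).

A TTL index is an index on a date field that gives documents a limited lifetime: once a document's indexed date is older than the specified number of seconds, the document expires and MongoDB deletes it automatically.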

zjy:PRIMARY> db.comments.find()
{ "_id" : ObjectId("55ae6b99313fd7b879b5296c"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:09.651Z") }
{ "_id" : ObjectId("55ae6b9a313fd7b879b5296d"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:10.739Z") }
{ "_id" : ObjectId("55ae6b9b313fd7b879b5296e"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:11.555Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b5296f"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:12.267Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b52970"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:12.899Z") }
zjy:PRIMARY> db.comments.ensureIndex({"ts":1},{expireAfterSeconds:60})  #create a TTL index with a 60-second expiry: a document is deleted once its ts value is more than 60 seconds old
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
zjy:PRIMARY> db.comments.find()
{ "_id" : ObjectId("55ae6b99313fd7b879b5296c"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:09.651Z") }
{ "_id" : ObjectId("55ae6b9a313fd7b879b5296d"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:10.739Z") }
{ "_id" : ObjectId("55ae6b9b313fd7b879b5296e"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:11.555Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b5296f"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:12.267Z") }
{ "_id" : ObjectId("55ae6b9c313fd7b879b52970"), "name" : "zhoujy", "age" : 22, "ts" : ISODate("2015-07-21T15:56:12.899Z") }

zjy:PRIMARY> db.comments.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test.comments"
    },
    {
        "v" : 1,
        "key" : {
            "ts" : 1
        },
        "name" : "ts_1",
        "ns" : "test.comments",
        "expireAfterSeconds" : 60
    }
]

zjy:PRIMARY> db.comments.find() #checked again after 60 seconds: the documents are gone
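
The expiry rule can be written down as a simple comparison. The sketch below is a simplified model, not the server's implementation: isExpired is a hypothetical helper, it handles only a single Date value in the indexed field, and it ignores the fact that the background task that removes expired documents runs periodically (every 60 seconds), so real deletion can lag behind the computed expiry time.

```javascript
// Hypothetical helper: would a TTL index with this setting consider the document expired at "now"?
function isExpired(doc, field, expireAfterSeconds, now) {
    var value = doc[field];
    // A document whose indexed field is missing or is not a Date never expires.
    if (!(value instanceof Date)) { return false; }
    // Expired once the indexed time plus the TTL has been reached.
    return now.getTime() >= value.getTime() + expireAfterSeconds * 1000;
}

// First document from the find() output above.
var comment = { name: "zhoujy", age: 22, ts: new Date("2015-07-21T15:56:09.651Z") };
var after30s = isExpired(comment, "ts", 60, new Date("2015-07-21T15:56:39.651Z"));
var after60s = isExpired(comment, "ts", 60, new Date("2015-07-21T15:57:09.651Z"));
```

This also explains why the find() issued right after creating the index can still return documents that are already past their expiry time.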

The last kind of index is the text index. For more information, see [MongoDB大數據處理權威指南 (The Definitive Guide to MongoDB), Chapter 8] and here

Test data:

db.comments.insert({"name":"abc","mem":"You can create a text index on the field or fields whose value is a string or an array of string elements","ts":new Date()})

db.comments.insert({"name":"def","mem":"When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**)","ts":new Date()})

db.comments.insert({"name":"ghi","mem":"This text index catalogs all string data in the subject field and the content field, where the field value is either a string or an array of string elements.","ts":new Date()})

db.comments.insert({"name":"jkl","mem":"To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content.","ts":new Date()})

db.comments.insert({"name":"mno","mem":"The following example indexes any string value in the data of every field of every document in collection and names the index TextIndex:","ts":new Date()})

Create:

> db.comments.ensureIndex({"mem":"text"})   #create a text index
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}

Query: the $text operator

> db.comments.find({$text:{$search:"specifier"}}).pretty()
{
    "_id" : ObjectId("55aee886a782f35b366926ef"),
    "name" : "jkl",
    "mem" : "To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content.",
    "ts" : ISODate("2015-07-22T00:49:10.350Z")
}
{
    "_id" : ObjectId("55aee886a782f35b366926ed"),
    "name" : "def",
    "mem" : "When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**)",
    "ts" : ISODate("2015-07-22T00:49:10.346Z")
}


> db.comments.runCommand("text",{search:"specifier"}) #works before 3.0; no longer available from 3.0 on
{
    "results" : [
        {
            "score" : 0.8653846153846153,
            "obj" : {
                "_id" : ObjectId("55aee886a782f35b366926ed"),
                "name" : "def",
                "mem" : "When creating a text index on multiple fields, you can specify the individual fields or you can use wildcard specifier ($**)",
                "ts" : ISODate("2015-07-22T00:49:10.346Z")
            }
        },
        {
            "score" : 0.5357142857142857,
            "obj" : {
                "_id" : ObjectId("55aee886a782f35b366926ef"),
                "name" : "jkl",
                "mem" : "To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content.",
                "ts" : ISODate("2015-07-22T00:49:10.350Z")
            }
        }
    ],
    "stats" : {
        "nscanned" : NumberLong(2),
        "nscannedObjects" : NumberLong(2),
        "n" : 2,
        "timeMicros" : 173
    },
    "ok" : 1
}

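The above is a brief introduction to the various index types and how to use them. For details and caveats, consult the official documentation, paying particular attention to text and TTL indexes.

3) Drop an index: dropIndex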

zjy:PRIMARY> db.abc.getIndexes()    #list the indexes
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test.abc"
    },
    {
        "v" : 1,
        "key" : {               #index key pattern
            "a" : "hashed"
        },
        "name" : "a_hashed",    #index name
        "ns" : "test.abc"
    },
    {
        "v" : 1,
        "key" : {
            "b" : 1
        },
        "name" : "b_1",
        "ns" : "test.abc"
    },
    {
        "v" : 1,
        "key" : {
            "c" : 1
        },
        "name" : "idx_c",
        "ns" : "test.abc"
    }
]
zjy:PRIMARY> db.abc.dropIndex({"a" : "hashed"})  #drop an index by its "key"
{ "nIndexesWas" : 4, "ok" : 1 }
zjy:PRIMARY> db.abc.dropIndex({"b" : 1})         #drop an index by its "key"
{ "nIndexesWas" : 3, "ok" : 1 }
zjy:PRIMARY> db.abc.dropIndex("idx_c")           #drop an index by its "name"
{ "nIndexesWas" : 2, "ok" : 1 }
zjy:PRIMARY> db.abc.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test.abc"
    }
]

zjy:PRIMARY> db.abc.dropIndex("*")              #drop all indexes on the collection except _id
{
    "nIndexesWas" : 4,
    "msg" : "non-_id indexes dropped for collection",
    "ok" : 1
}

4) Rebuild indexes: reIndex. Rebuild when an index has become corrupted.

zjy:PRIMARY> db.abc.reIndex() #run the rebuild
{
    "nIndexesWas" : 1,
    "nIndexes" : 1,
    "indexes" : [
        {
            "key" : {
                "_id" : 1
            },
            "name" : "_id_",
            "ns" : "test.abc"
        }
    ],
    "ok" : 1
}

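5) Force a specific index: hint

db.abc.find({"c":1,"b":2}).hint("b_1")  #hint takes either the index key pattern or the index name

Summary:

Indexes speed up lookups, sorts, and similar operations, but they add overhead to inserts, updates, and deletes. So do not add indexes indiscriminately; put a suitable index on the fields that actually need one. See the official documentation for more.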
