Intermediate Python: storing JSON data with pymongo


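  This post introduces pymongo, a third-party library for working with MongoDB from Python, and shows a simple crawler built with the requests library. The warmth and coldness of human feelings are like flowers blooming and fading; better to think of it as a season that was always going to come.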

 

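Installing pymongo and preparation

1. Installing and starting MongoDB

Test machine: Windows 10, MongoDB v3.4.0, Python 3.6.3.

MongoDB is installed in D:\Database\DataBase\Mongo, and the data files live in E:\data\database\mango\data. First start the MongoDB server; the last line of the log shows it waiting for connections on local port 27017.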

D:\Database\DataBase\Mongo\Server\3.4\bin>mongod --dbpath E:\data\database\mango\data
2017-11-21T20:48:38.458+0800 I CONTROL  [initandlisten] MongoDB starting : pid=20484 port=27017 dbpath=E:\data\database\mango\data 64-bit host=Linux
2017-11-21T20:48:38.461+0800 I CONTROL  [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
2017-11-21T20:48:38.462+0800 I CONTROL  [initandlisten] db version v3.4.0
2017-11-21T20:48:38.463+0800 I CONTROL  [initandlisten] git version: f4240c60f005be757399042dc12f6addbc3170c1
2017-11-21T20:48:38.464+0800 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1t-fips  3 May 2016
2017-11-21T20:48:38.465+0800 I CONTROL  [initandlisten] allocator: tcmalloc
2017-11-21T20:48:38.466+0800 I CONTROL  [initandlisten] modules: none
2017-11-21T20:48:38.466+0800 I CONTROL  [initandlisten] build environment:
2017-11-21T20:48:38.467+0800 I CONTROL  [initandlisten]     distmod: 2008plus-ssl
2017-11-21T20:48:38.468+0800 I CONTROL  [initandlisten]     distarch: x86_64
2017-11-21T20:48:38.469+0800 I CONTROL  [initandlisten]     target_arch: x86_64
2017-11-21T20:48:38.469+0800 I CONTROL  [initandlisten] options: { storage: { dbPath: "E:\data\database\mango\data" } }
2017-11-21T20:48:38.491+0800 I -        [initandlisten] Detected data files in E:\data\database\mango\data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-11-21T20:48:38.493+0800 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=5573M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-11-21T20:48:39.931+0800 I CONTROL  [initandlisten]
2017-11-21T20:48:39.933+0800 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-11-21T20:48:39.936+0800 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-11-21T20:48:39.940+0800 I CONTROL  [initandlisten]
2017-11-21T20:48:41.253+0800 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory 'E:/data/database/mango/data/diagnostic.data'
2017-11-21T20:48:41.259+0800 I NETWORK  [thread1] waiting for connections on port 27017
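
Before running any pymongo code, it can help to confirm that something is listening on port 27017. The following is a minimal sketch using only the standard library. The helper name `is_port_open` and the one-second timeout are illustrative choices, not part of MongoDB or pymongo, and an open port only shows that some process is listening, not that it is mongod.

```python
import socket


def is_port_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    print("port 27017 reachable:", is_port_open())
```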

Starting the MongoDB client: D:\Database\DataBase\Mongo\Server\3.4\bin\mongo.exe. Double-click it to run.

MongoDB shell version v3.4.0
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.0
Server has startup warnings:
2017-11-21T20:48:39.931+0800 I CONTROL  [initandlisten]
2017-11-21T20:48:39.933+0800 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-11-21T20:48:39.936+0800 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-11-21T20:48:39.940+0800 I CONTROL  [initandlisten]
>

 

2. Installing pymongo in Python

pip install pymongo

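Here is a brief look at how pymongo is used; the code is the introductory example from GitHub. The u'...' prefixes in the output come from Python 2; under Python 3 the same strings print without the prefix.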

>>> import pymongo
>>> client = pymongo.MongoClient("localhost", 27017)
>>> db = client.test
>>> db.name
u'test'
>>> db.my_collection
Collection(Database(MongoClient('localhost', 27017), u'test'), u'my_collection')
>>> db.my_collection.insert_one({"x": 10}).inserted_id
ObjectId('4aba15ebe23f6b53b0000000')
>>> db.my_collection.insert_one({"x": 8}).inserted_id
ObjectId('4aba160ee23f6b543e000000')
>>> db.my_collection.insert_one({"x": 11}).inserted_id
ObjectId('4aba160ee23f6b543e000002')
>>> db.my_collection.find_one()
{u'x': 10, u'_id': ObjectId('4aba15ebe23f6b53b0000000')}
>>> for item in db.my_collection.find():
...     print(item["x"])
...
10
8
11
>>> db.my_collection.create_index("x")
u'x_1'
>>> for item in db.my_collection.find().sort("x", pymongo.ASCENDING):
...     print(item["x"])
...
8
10
11
>>> [item["x"] for item in db.my_collection.find().limit(2).skip(1)]
[8, 11]
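
The last line of the session may look surprising: `limit(2)` is written before `skip(1)`, yet the result is `[8, 11]` rather than `[8]`. MongoDB applies the skip before the limit regardless of the order in which the two cursor methods are chained. A plain-Python sketch of that behaviour, using a list in place of the collection (`skip_then_limit` is a hypothetical stand-in, not a pymongo API, and it assumes the documents come back in insertion order as they did in the session):

```python
def skip_then_limit(docs, skip=0, limit=0):
    """Emulate cursor paging on a list: drop `skip` items, then keep `limit`.

    As in MongoDB, a limit of 0 means "no limit".
    """
    remaining = docs[skip:]
    return remaining[:limit] if limit else remaining


# Insertion order of the three documents in the session above.
docs = [{"x": 10}, {"x": 8}, {"x": 11}]

print([d["x"] for d in skip_then_limit(docs, skip=1, limit=2)])  # [8, 11]
```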

 

A pymongo usage example

1. A Python crawler that stores its data with pymongo

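import json

import pymongo
import requests


def requestData():
    """POST the query to the task API and return the decoded JSON response."""
    url = 'http://****.com/*.do'
    data = {
        'projectId': 90,
        'myTaskFlag': 1,
        'userId': 40
    }
    # The payload is sent as a JSON string in the request body.
    response = requests.post(url, data=json.dumps(data), timeout=10)
    response.raise_for_status()  # fail early on HTTP 4xx/5xx responses
    return response.json()


def output_data(json_data):
    """Store the 'taskList' entries of the response in the test.tasks collection."""
    tasks_data = json_data.get('taskList')
    if not tasks_data:
        return  # insert_many raises on an empty list
    client = pymongo.MongoClient(host='localhost', port=27017)
    try:
        db = client.test
        collection = db.tasks
        # Collection.insert() is deprecated since PyMongo 3.0; insert_many takes a list of documents.
        collection.insert_many(tasks_data)
    finally:
        client.close()


if __name__ == '__main__':
    json_data = requestData()
    output_data(json_data)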
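
Running the script above twice stores every task twice, because each insert creates a document with a new `_id`. One possible way around this is to upsert on a business key. The sketch below assumes that `taskId` uniquely identifies a task (an assumption about the API's data, not something the source confirms) and accepts any object that exposes pymongo's `Collection.replace_one`; `upsert_tasks` is a hypothetical helper name.

```python
def upsert_tasks(collection, tasks, key="taskId"):
    """Insert or replace each task, matching on `key`; return the count written.

    `collection` is expected to behave like a pymongo Collection.
    Assumption: every task dict carries a unique value under `key`.
    """
    written = 0
    for task in tasks or []:
        if key not in task:
            continue  # skip records without the key rather than guess one
        # With upsert=True, replace_one inserts when no document matches the filter.
        collection.replace_one({key: task[key]}, task, upsert=True)
        written += 1
    return written
```

With a real connection this could be called as `upsert_tasks(client.test.tasks, json_data.get('taskList'))` in place of the plain insert.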

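The data is stored in the tasks collection, using MongoDB's default test database. After the program has run, the data can be inspected from the MongoDB client: running db.tasks.find().pretty() lists every document in the tasks collection.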

{
        "_id" : ObjectId("5a1427a2edc9f04be40bc02d"),
        "taskId" : 1,
        "summary" : "PC版“個人信息”頁面優化",
        "status" : 8,
        "categoryId" : 3,
        "creatorId" : 7,
        "projectId" : 1,
        "dateSubmit" : NumberLong("1481105108000"),
        "level" : 1,
        "handlerId" : 2,
        "ViewState" : 2,
        "priority" : 2
}
{
        "_id" : ObjectId("5a1427a2edc9f04be40bc02e"),
        "taskId" : 2,
        "summary" : "PC版“添加新任務”界面字體太大",
        "status" : 8,
        "categoryId" : 3,
        "creatorId" : 7,
        "projectId" : 1,
        "dateSubmit" : NumberLong("1481105195000"),
        "level" : 1,
        "handlerId" : 2,
        "ViewState" : 2,
        "priority" : 1
}
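
The `dateSubmit` values look like Unix timestamps in milliseconds (an inference from their magnitude, not something the source states). If that is right, they can be converted to timezone-aware datetimes with the standard library; `millis_to_datetime` is an illustrative helper name.

```python
from datetime import datetime, timedelta, timezone


def millis_to_datetime(millis):
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Adding a timedelta avoids float rounding and platform limits of fromtimestamp.
    return epoch + timedelta(milliseconds=millis)


# dateSubmit of the first task in the output above.
print(millis_to_datetime(1481105108000).isoformat())  # 2016-12-07T10:05:08+00:00
```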

 
