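I have recently been doing some database research, with the goal of making images faster to store, retrieve, and query, and of supporting operations such as concurrent access.
I have summarized the results in a mongoImg class, which also serves as a short wrap-up.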
    '''
    Created on 2013-8-6
    class mongoImg
    @author: tree
    '''

    import os
    import time

    import gridfs
    from pymongo.database import Database


    class mongoImg(object):
        """mongoImg stores the images of a directory in MongoDB (GridFS)
        and reads them back by filename.
        """

        def __init__(self, database, dir):
            """Create a new instance of :class:`mongoImg`.

            :Parameters:
              - `database`: database to use
              - `dir`: directory containing the images
            """
            if not isinstance(database, Database):
                raise TypeError("database must be an instance of Database")
            if len(dir) < 1:
                raise TypeError("dir must be a non-empty directory path")

            # The connection could also be created here instead of being passed in.
            self.__imgdb = database
            self.__imgfs = gridfs.GridFS(self.__imgdb)
            self.__dir = dir
            self.__filelist = []

        def __dirwalk(self, topdown=True):
            """Walk self.__dir and collect every file path in self.__filelist."""
            count = 0
            self.__filelist.clear()

            for root, dirs, files in os.walk(self.__dir, topdown):
                for name in files:
                    count += 1
                    self.__filelist.append(os.path.join(root, name))
            print(count)

        def insert(self):
            """Insert every image found under self.__dir into MongoDB."""
            self.__dirwalk()

            tStart = time.time()
            for fi in self.__filelist:
                # Read the image as bytes and store the bytes, not the file object.
                with open(fi, 'rb') as myimage:
                    data = myimage.read()
                    self.__imgfs.put(data, content_type="jpg", filename=fi)

            tEnd = time.time()
            print("It cost %f sec" % (tEnd - tStart))

        def getbyname(self, filename, savepath):
            """Get an image from MongoDB by filename and write it to savepath."""
            if len(savepath) < 1:
                raise TypeError("savepath must be a non-empty file path")
            dataout = self.__imgfs.get_version(filename)
            # `with` closes the output file even if writing fails.
            with open(savepath, 'wb') as imgout:
                imgout.write(dataout.read())
Usage example (the database connection could also be created inside the class):
    # MongoClient replaces the older Connection class, which PyMongo 3.0 removed.
    from pymongo import MongoClient
    import mongoImg

    filedir = r'D:\image'
    con = MongoClient()
    db = con.imgdb
    imgmongo = mongoImg.mongoImg(db, filedir)
    imgmongo.insert()
MongoDB feels quite fast for storing image tiles: a little over 10,000 images took roughly 100 to 200 seconds.
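Those numbers work out to roughly 50 to 100 images per second. To compare runs more easily, the count and elapsed time can be returned instead of printed. The following is only a sketch: `list_files` and `timed_run` are hypothetical helper names, not part of the class, and `handler` stands for whatever is done with each file (for example a call to GridFS `put`).

```python
import os
import time


def list_files(top):
    """Return the path of every file below `top`, using os.walk."""
    paths = []
    for root, _dirs, files in os.walk(top):
        for name in files:
            paths.append(os.path.join(root, name))
    return paths


def timed_run(paths, handler):
    """Call handler(data, path) for each file; return (file count, seconds)."""
    start = time.perf_counter()
    for path in paths:
        with open(path, "rb") as f:
            handler(f.read(), path)
    return len(paths), time.perf_counter() - start
```

With a GridFS instance `fs`, this could be used as `count, seconds = timed_run(list_files(filedir), lambda data, path: fs.put(data, filename=path))`, after which `count / seconds` gives images per second.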
Tips:
The gridfs.GridFS.put function
put(data, **kwargs): Put data in GridFS as a new file. Equivalent to doing:

    try:
        f = new_file(**kwargs)
        f.write(data)
    finally:
        f.close()
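When storing and reading images I made a rookie mistake: I stored the file object returned by open as if it were the data, and then could not read the data back no matter what I tried.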
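A minimal sketch of the approach that worked: read the file into bytes first and hand those bytes to `put`. The names `store_image` and `FakeFS` are hypothetical. `FakeFS` is only an in-memory stand-in so the sketch runs without a server; with a real database, a `gridfs.GridFS` instance would be passed in its place.

```python
def store_image(fs, path):
    """Read the file at `path` as bytes and store it through fs.put.

    `fs` is assumed to behave like gridfs.GridFS, i.e. to offer
    put(data, **kwargs) and return the id of the new file.
    """
    with open(path, "rb") as image_file:
        data = image_file.read()  # bytes, not the file object itself
    return fs.put(data, filename=path)


class FakeFS:
    """Hypothetical in-memory stand-in for gridfs.GridFS (illustration only)."""

    def __init__(self):
        self.files = {}

    def put(self, data, **kwargs):
        # This stand-in only accepts bytes, to make the point of the sketch.
        if not isinstance(data, bytes):
            raise TypeError("expected bytes")
        file_id = len(self.files) + 1
        self.files[file_id] = (kwargs.get("filename"), data)
        return file_id
```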
Also, image data is best read as a byte stream, that is, by opening the file in binary mode ('rb'), for example:

    pipe = open('/dev/input/js0','rb')
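If the data is stored as str, a UnicodeDecodeError may occur, apparently because some byte values in the image data fall outside the range that Python's default encoding can decode.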
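The difference can be reproduced with a few bytes, without a database. This sketch assumes Python 3 and uses UTF-8 explicitly as the text encoding. The helper names and the four-byte sample (the usual start of a JPEG/JFIF file) are chosen for illustration only.

```python
import os
import tempfile

# Typical first bytes of a JPEG file; 0xFF cannot start a UTF-8 sequence.
JPEG_HEADER = b"\xff\xd8\xff\xe0"


def read_binary(path):
    """Read the file as raw bytes."""
    with open(path, "rb") as f:
        return f.read()


def read_text(path, encoding="utf-8"):
    """Read the file as text, decoding its bytes with `encoding`."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def demo():
    """Write the sample bytes to a temporary file and read them back both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.jpg")
        with open(path, "wb") as f:
            f.write(JPEG_HEADER)
        raw = read_binary(path)
        try:
            read_text(path)
            decode_failed = False
        except UnicodeDecodeError:
            decode_failed = True
    return raw, decode_failed
```

Reading in binary mode returns the bytes unchanged, while the text-mode read fails during decoding, which is why binary mode is the safer choice for image data.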
P.S. I am new to Python and have forgotten most of what I knew about database operations, so criticism and corrections are welcome.