The Road to Python, Day 20 - Developing a Distributed Monitoring System


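In this section

Why build a monitoring system?

Discussion of common monitoring system designs

Monitoring system architecture design

Monitoring table schema design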

 

 

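Why build a monitoring system?

– Become familiar with the design principles behind IT monitoring systems
– Develop a simplified, Zabbix-like monitoring system
– Master the program design approach and the decoupling principles used in automation development projects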
 

Discussion of common monitoring system designs

Zabbix
Nagios
 

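Monitoring system requirements

1. Monitor common system services, applications, network devices, and so on.
2. Monitor several different services on one host, each with its own monitoring interval.
3. Allow the same service to have a different monitoring interval and alert threshold on different hosts.
4. Add, remove, or modify monitored services for a batch of hosts in one operation.
5. Alert levels:
  • Services differ in business importance, so a problem in each service can be given a different alert level
  • Events for a specific service or alert level can be routed to specific users
  • Alert escalation settings

6. Historical data storage and optimization:
  • Store as much useful data as possible in as little space as possible
  • How can five years of monitoring data for all services on one host be retrieved within 1 second?

7. Data visualization: how to build a clean, attractive user interface?

8. How can a single server support monitoring 5000+ machines?
9. Which communication mode should be used: active or passive?
10. How can the monitoring server be scaled horizontally?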
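Requirement 6 leaves open how history can stay small while still covering years. One common approach, shown here only as a sketch and not as the design this course settles on, is downsampling: keep raw samples for recent data and replace older samples with one averaged point per larger time bucket. The function name `downsample`, the `(timestamp, value)` tuple layout, and the bucket size are all assumptions made for illustration.

```python
from collections import OrderedDict


def downsample(points, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Returns a list of (bucket_start, average) tuples ordered by time.
    """
    buckets = OrderedDict()
    for ts, value in sorted(points):
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in buckets.items()]


# Six raw samples taken every 60 s collapse into two 180 s points.
raw = [(0, 10.0), (60, 20.0), (120, 30.0), (180, 40.0), (240, 50.0), (300, 60.0)]
coarse = downsample(raw, 180)
```

Applying the same function repeatedly with growing bucket sizes would give several resolutions of the same series, so a long time range can be served from the coarsest one.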
 
 

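Which architecture?

• MySQL
• Active communication? SNMP, wget…
• Passive communication? Agent: how should it communicate with the monitor server?
• Socket server -> socket client
• Can an off-the-shelf client/server mechanism be used? RabbitMQ, Redis publish/subscribe, HTTP?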
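If the agent option is chosen, the agent itself has to decide when each check is due, because the requirements allow every service its own interval. A minimal sketch of that bookkeeping follows; the configuration layout (service name mapped to an interval in seconds), the `last_run` dictionary, and the function name `due_services` are hypothetical and not taken from the course code.

```python
def due_services(intervals, last_run, now):
    """Return the names of services whose interval has elapsed.

    intervals: {service_name: interval_seconds}
    last_run:  {service_name: timestamp of the last check}; a missing
               entry means the service has never been checked
    now:       current timestamp in seconds
    """
    due = []
    for name, interval in sorted(intervals.items()):
        last = last_run.get(name)
        if last is None or now - last >= interval:
            due.append(name)
    return due


intervals = {"cpu": 30, "memory": 60, "load": 300}
last_run = {"cpu": 100, "memory": 100}  # "load" has never been checked
ready = due_services(intervals, last_run, now=140)
```

An agent loop would call this once per tick, run the plugins for the returned services, and then record the current time in `last_run` for each of them.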
 

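Advantages of using HTTP

1. The interface is simple to design

2. It is easy to scale horizontally into a distributed setup

3. The underlying socket handling is stable and mature, which saves much of the effort of maintaining the communication layer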

 

HTTP characteristics:

1. Short-lived connections

2. Stateless

3. Security and authentication

4. Passive communication
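Because HTTP connections are short-lived and stateless, every report an agent sends has to carry everything the server needs to identify it. The sketch below builds such a request with the standard library's `urllib.request` without actually sending it; the URL, the JSON field names, and the function name `build_report_request` are placeholders invented for illustration.

```python
import json
import urllib.request


def build_report_request(server_url, host_id, service_name, data):
    """Build a self-contained POST request carrying one service report."""
    payload = {
        "host_id": host_id,  # identifies the sender on every request
        "service": service_name,
        "data": data,
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        server_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_report_request(
    "http://monitor.example.com/api/report/",  # placeholder URL
    host_id=1,
    service_name="cpu",
    data={"idle": 80, "iowait": 5},
)
# An agent would send this with urllib.request.urlopen(req, timeout=...).
```

Since no session is kept between requests, any server behind a load balancer can accept the report, which is what makes horizontal scaling straightforward.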

 

 

Monitoring system architecture design

 

 
 
 
 

Table schema design

# -*- coding: utf-8 -*-
# Written in Python 2 style; on Python 3, rename each __unicode__ method to __str__.
from django.db import models

# Create your models here.


class Host(models.Model):
    name = models.CharField(max_length=64, unique=True)
    ip_addr = models.GenericIPAddressField(unique=True)
    host_groups = models.ManyToManyField('HostGroup', blank=True)  # A B C
    templates = models.ManyToManyField("Template", blank=True)  # A D E
    monitored_by_choices = (
        ('agent', 'Agent'),
        ('snmp', 'SNMP'),
        ('wget', 'WGET'),
    )
    monitored_by = models.CharField(u'Monitoring method', max_length=64, choices=monitored_by_choices)
    status_choices = (
        (1, 'Online'),
        (2, 'Down'),
        (3, 'Unreachable'),
        (4, 'Offline'),
    )
    status = models.IntegerField(u'Status', choices=status_choices, default=1)
    memo = models.TextField(u"Memo", blank=True, null=True)

    def __unicode__(self):
        return self.name


class HostGroup(models.Model):
    name = models.CharField(max_length=64, unique=True)
    templates = models.ManyToManyField("Template", blank=True)
    memo = models.TextField(u"Memo", blank=True, null=True)

    def __unicode__(self):
        return self.name


class ServiceIndex(models.Model):
    # One metric (index) reported by a service, e.g. CPU idle.
    name = models.CharField(max_length=64)
    key = models.CharField(max_length=64)
    data_type_choices = (
        ('int', "int"),
        ('float', "float"),
        ('str', "string"),
    )
    data_type = models.CharField(u'Index data type', max_length=32, choices=data_type_choices, default='int')
    memo = models.CharField(u"Memo", max_length=128, blank=True, null=True)

    def __unicode__(self):
        return "%s.%s" % (self.name, self.key)


class Service(models.Model):
    name = models.CharField(u'Service name', max_length=64, unique=True)
    interval = models.IntegerField(u'Monitoring interval', default=60)
    plugin_name = models.CharField(u'Plugin name', max_length=64, default='n/a')
    items = models.ManyToManyField('ServiceIndex', verbose_name=u"Index list", blank=True)
    memo = models.CharField(u"Memo", max_length=128, blank=True, null=True)

    def __unicode__(self):
        return self.name
    #def get_service_items(obj):
    #    return ",".join([i.name for i in obj.items.all()])


class Template(models.Model):
    name = models.CharField(u'Template name', max_length=64, unique=True)
    services = models.ManyToManyField('Service', verbose_name=u"Service list")
    triggers = models.ManyToManyField('Trigger', verbose_name=u"Trigger list", blank=True)

    def __unicode__(self):
        return self.name


# Disabled alternative version of TriggerExpression, wrapped in a string
# literal so that it is never executed.
'''
class TriggerExpression(models.Model):
    name = models.CharField(u"Trigger expression name",max_length=64,blank=True,null=True)
    service = models.ForeignKey(Service,verbose_name=u"Related service")
    service_index = models.ForeignKey(ServiceIndex,verbose_name=u"Related service index")
    logic_type_choices = (('or','OR'),('and','AND'))
    logic_type = models.CharField(u"Logical relation",choices=logic_type_choices,max_length=32,blank=True,null=True)
    left_sibling = models.ForeignKey('self',verbose_name=u"Left-hand condition",blank=True,null=True,related_name='left_sibling_condition' )
    operator_type_choices = (('eq','='),('lt','<'),('gt','>'))
    operator_type = models.CharField(u"Operator",choices=operator_type_choices,max_length=32)
    data_calc_type_choices = (
        ('avg','Average'),
        ('max','Max'),
        ('hit','Hit'),
        ('last','Last'),
    )
    data_calc_func= models.CharField(u"Data calculation method",choices=data_calc_type_choices,max_length=64)
    data_calc_args = models.CharField(u"Function arguments",help_text=u"Separate multiple arguments with commas; the first value is the time",max_length=64)
    threshold = models.IntegerField(u"Threshold")

    def __unicode__(self):
        return "%s %s(%s(%s))" %(self.service_index,self.operator_type,self.data_calc_func,self.data_calc_args)
'''


class TriggerExpression(models.Model):
    #name = models.CharField(u"Trigger expression name",max_length=64,blank=True,null=True)
    # on_delete is required from Django 2.0; CASCADE matches the earlier default.
    trigger = models.ForeignKey('Trigger', verbose_name=u"Owning trigger", on_delete=models.CASCADE)
    service = models.ForeignKey(Service, verbose_name=u"Related service", on_delete=models.CASCADE)
    service_index = models.ForeignKey(ServiceIndex, verbose_name=u"Related service index", on_delete=models.CASCADE)
    specified_index_key = models.CharField(verbose_name=u"Monitor only this specific index key", max_length=64, blank=True, null=True)
    operator_type_choices = (('eq', '='), ('lt', '<'), ('gt', '>'))
    operator_type = models.CharField(u"Operator", choices=operator_type_choices, max_length=32)
    data_calc_type_choices = (
        ('avg', 'Average'),
        ('max', 'Max'),
        ('hit', 'Hit'),
        ('last', 'Last'),
    )
    data_calc_func = models.CharField(u"Data calculation method", choices=data_calc_type_choices, max_length=64)
    data_calc_args = models.CharField(u"Function arguments", help_text=u"Separate multiple arguments with commas; the first value is the time", max_length=64)
    threshold = models.IntegerField(u"Threshold")

    logic_type_choices = (('or', 'OR'), ('and', 'AND'))
    logic_type = models.CharField(u"Logical relation with another condition", choices=logic_type_choices, max_length=32, blank=True, null=True)
    #next_condition = models.ForeignKey('self',verbose_name=u"Right-hand condition",blank=True,null=True,related_name='right_sibling_condition' )

    def __unicode__(self):
        return "%s %s(%s(%s))" % (self.service_index, self.operator_type, self.data_calc_func, self.data_calc_args)

    class Meta:
        pass  #unique_together = ('trigger_id','service')


class Trigger(models.Model):
    name = models.CharField(u'Trigger name', max_length=64)
    #expressions= models.TextField(u"Expressions")
    severity_choices = (
        (1, 'Information'),
        (2, 'Warning'),
        (3, 'Average'),
        (4, 'High'),
        (5, 'Disaster'),
    )
    #expressions = models.ManyToManyField(TriggerExpression,verbose_name=u"Condition expressions")
    severity = models.IntegerField(u'Alert level', choices=severity_choices)
    enabled = models.BooleanField(default=True)
    memo = models.TextField(u"Memo", blank=True, null=True)

    def __unicode__(self):
        return "<service:%s, severity:%s>" % (self.name, self.get_severity_display())


class Action(models.Model):
    name = models.CharField(max_length=64, unique=True)
    host_groups = models.ManyToManyField('HostGroup', blank=True)
    hosts = models.ManyToManyField('Host', blank=True)

    conditions = models.TextField(u'Alert conditions')
    interval = models.IntegerField(u'Alert interval (s)', default=300)
    operations = models.ManyToManyField('ActionOperation')

    recover_notice = models.BooleanField(u'Send a notification after recovery', default=True)
    recover_subject = models.CharField(max_length=128, blank=True, null=True)
    recover_message = models.TextField(blank=True, null=True)

    enabled = models.BooleanField(default=True)

    def __unicode__(self):
        return self.name


class ActionOperation(models.Model):
    name = models.CharField(max_length=64)
    step = models.SmallIntegerField(u"Nth alert", default=1)
    action_type_choices = (
        ('email', 'Email'),
        ('sms', 'SMS'),
        ('script', 'RunScript'),
    )
    action_type = models.CharField(u"Action type", choices=action_type_choices, default='email', max_length=64)
    #notifiers= models.ManyToManyField(host_models.UserProfile,verbose_name=u"Notification recipients",blank=True)

    def __unicode__(self):
        return self.name


class Maintenance(models.Model):
    name = models.CharField(max_length=64, unique=True)
    hosts = models.ManyToManyField('Host', blank=True)
    host_groups = models.ManyToManyField('HostGroup', blank=True)
    content = models.TextField(u"Maintenance details")
    start_time = models.DateTimeField()
    end_time = models.DateTimeField()

    def __unicode__(self):
        return self.name


# Notes only: example services and their indexes.
'''
CPU
    idle 80
    usage  90
    system  30
    user
    iowait  50

memory :
    usage
    free
    swap
    cache
    buffer

load:
    load1
    load 5
    load 15
'''
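To show how a `TriggerExpression` row might be applied to collected samples, here is a plain-Python sketch that needs no Django. It covers the `avg`, `max`, and `last` calculation functions and the `eq`, `lt`, `gt` operators from the choices in the schema; `hit` is left out because the schema does not define its semantics. The function names, the idea of passing in a ready-made list of samples, and the reading of `logic_type` as "joins this condition to the next one" are all assumptions for illustration.

```python
import operator

# Calculation functions keyed by the data_calc_func choices in the schema.
CALC_FUNCS = {
    "avg": lambda samples: sum(samples) / len(samples),
    "max": max,
    "last": lambda samples: samples[-1],
}

# Comparison operators keyed by the operator_type choices in the schema.
OPERATORS = {"eq": operator.eq, "lt": operator.lt, "gt": operator.gt}


def expression_matches(samples, calc_func, operator_type, threshold):
    """Return True if the aggregated samples satisfy the comparison."""
    if not samples:
        return False  # no data in the window: treat as not matching
    value = CALC_FUNCS[calc_func](samples)
    return OPERATORS[operator_type](value, threshold)


def trigger_fires(results):
    """Combine (matched, logic_type) pairs from left to right.

    logic_type ('and' or 'or') joins a result to the next one;
    the logic_type of the final pair is ignored.
    """
    outcome, pending = results[0]
    for matched, logic_type in results[1:]:
        if pending == "and":
            outcome = outcome and matched
        else:
            outcome = outcome or matched
        pending = logic_type
    return outcome


cpu_iowait = [40, 55, 70]  # hypothetical samples from the last window
load1 = [1.0, 1.5, 2.0]
fires = trigger_fires([
    (expression_matches(cpu_iowait, "avg", "gt", 50), "and"),
    (expression_matches(load1, "last", "gt", 4), None),
])
```

Here the average iowait exceeds its threshold but the latest load does not, so the combined `and` condition does not fire.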

 

 

