SaltStack: the salt-master open files limit problem


1. Introduction:

A single salt-master now manages 2101 minions, and the master log shows the following message:

2016-09-09 11:36:22,221 [salt.utils.verify][CRITICAL][10919] The number of accepted minion keys(2101) should be lower than 1/4 of the max open files soft setting(4096). Please consider raising this value.

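Unless this is resolved, no new nodes can be added. The log reports a max open files value of 4096, which is odd.

ulimit -a shows open files as 65535, which made me wonder whether the problem lay in Salt itself.

2. Solving the problem:

After some searching on Baidu and Google, I found an issue in the saltstack repository on GitHub: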

salt-master not recognizing max files increase #5323

The check is done by the check_max_open_files function in /usr/lib/python2.7/site-packages/salt/utils/verify.py, shown below:

def check_max_open_files(opts):
    '''
    Check the number of max allowed open files and adjust if needed
    '''
    mof_c = opts.get('max_open_files', 100000)
    if sys.platform.startswith('win'):
        # Check the Windows API for more detail on this
        # http://msdn.microsoft.com/en-us/library/xt874334(v=vs.71).aspx
        # and the python binding http://timgolden.me.uk/pywin32-docs/win32file.html
        mof_s = mof_h = win32file._getmaxstdio()
    else:
        mof_s, mof_h = resource.getrlimit(resource.RLIMIT_NOFILE)

    accepted_keys_dir = os.path.join(opts.get('pki_dir'), 'minions')
    accepted_count = len(os.listdir(accepted_keys_dir))

    log.debug(
        'This salt-master instance has accepted {0} minion keys.'.format(
            accepted_count
        )
    )

    level = logging.INFO

    if (accepted_count * 4) <= mof_s:
        # We check for the soft value of max open files here because that's the
        # value the user chose to raise to.
        #
        # The number of accepted keys multiplied by four(4) is lower than the
        # soft value, everything should be OK
        return

    msg = (
        'The number of accepted minion keys({0}) should be lower than 1/4 '
        'of the max open files soft setting({1}). '.format(
            accepted_count, mof_s
        )
    )
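    # Temporary debug output added while investigating (not needed for the
    # check itself): record the values that salt-master actually sees.
    with open("/tmp/openfile.txt","a") as f:
        f.write("mof_s-->%s\n"%mof_s)
        f.write("accepted_count-->%s\n"%accepted_count)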

    if accepted_count >= mof_s:
        # This should never occur, it might have already crashed
        msg += 'salt-master will crash pretty soon! '
        level = logging.CRITICAL
    elif (accepted_count * 2) >= mof_s:
        # This is way too low, CRITICAL
        level = logging.CRITICAL
    elif (accepted_count * 3) >= mof_s:
        level = logging.WARNING
        # The accepted count is more than 3 time, WARN
    elif (accepted_count * 4) >= mof_s:
        level = logging.INFO

    if mof_c < mof_h:
        msg += ('According to the system\'s hard limit, there\'s still a '
                'margin of {0} to raise the salt\'s max_open_files '
                'setting. ').format(mof_h - mof_c)

    msg += 'Please consider raising this value.'
    log.log(level=level, msg=msg)
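
The thresholds in this function explain why the message in the log was CRITICAL. A minimal sketch of just the comparison logic follows; classify_open_files_check is a hypothetical helper written for illustration, not part of Salt, and it only mirrors the branches shown above.

```python
import logging


def classify_open_files_check(accepted_count, soft_limit):
    """Return the logging level the check above would use, or None if it passes.

    Hypothetical helper: it only mirrors the comparisons in
    check_max_open_files and is not part of Salt.
    """
    if accepted_count * 4 <= soft_limit:
        return None  # enough headroom, the function returns silently
    if accepted_count >= soft_limit:
        return logging.CRITICAL
    if accepted_count * 2 >= soft_limit:
        return logging.CRITICAL
    if accepted_count * 3 >= soft_limit:
        return logging.WARNING
    return logging.INFO


# Values taken from the log line in the introduction.
level = classify_open_files_check(2101, 4096)
print(logging.getLevelName(level))  # CRITICAL, since 2101 * 2 = 4202 >= 4096

# With a soft limit of 65535 the same key count passes: 2101 * 4 = 8404 <= 65535.
print(classify_open_files_check(2101, 65535))  # None
```

So with 2101 accepted keys, the warning disappears only once the soft limit seen by salt-master is at least 8404.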

resource.getrlimit(resource.RLIMIT_NOFILE) returns the soft and hard limits on the number of open files. Running it on its own gives:

>>> import resource
>>> resource.getrlimit(resource.RLIMIT_NOFILE)
(65535, 65535)

Strangely, the standalone call returns 65535, while the same call inside salt returns 4096.
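
One way to see the limit that applies to the running daemon, rather than to the login shell, is to read /proc/<pid>/limits on Linux. The sketch below is only an illustration: the function names are hypothetical, and the sample text is a shortened imitation of that file's layout, not captured output.

```python
def parse_max_open_files(limits_text):
    """Extract (soft, hard) for 'Max open files' from /proc/<pid>/limits text.

    Values are returned as strings because a limit may read 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Line layout: "Max open files  <soft>  <hard>  files"
            fields = line.split()
            return fields[3], fields[4]
    raise ValueError("no 'Max open files' line found")


def read_max_open_files(pid):
    """Read the limits of a running process, e.g. the salt-master PID."""
    with open("/proc/{0}/limits".format(pid)) as handle:
        return parse_max_open_files(handle.read())


# Shortened, made-up sample in the layout of /proc/<pid>/limits.
sample = (
    "Limit                     Soft Limit           Hard Limit           Units\n"
    "Max open files            4096                 4096                 files\n"
)
print(parse_max_open_files(sample))  # ('4096', '4096')
```

Calling read_max_open_files with the PID of the salt-master process and comparing the result with the shell's ulimit output shows whether the two really run under different limits.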

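We run several salt-masters in production, on different operating system versions. Checking them showed that only CentOS 7 and later are affected, so this is a system-level issue.

On CentOS 5/6, resource limits can be set in /etc/security/limits.conf, either per user (root, a named user, and so on) or with * for all users. Files under /etc/security/limits.d/ can be used as well: the system loads limits.conf first, then the files in limits.d in alphabetical order, and later settings override earlier ones.

On CentOS 7/RHEL 7, however, systemd replaces the earlier SysV init, so the scope of /etc/security/limits.conf is narrower. Settings there apply only to users who log in through PAM; they have no effect on the resource limits of systemd services. Limits for login users are still configured as described above, through /etc/security/limits.conf and limits.d.

So how are resource limits configured for systemd services?

Global settings go in /etc/systemd/system.conf and /etc/systemd/user.conf. All .conf files in the two corresponding directories, /etc/systemd/system.conf.d/*.conf and /etc/systemd/user.conf.d/*.conf, are loaded as well. system.conf is used by the system instance and user.conf by user instances. For ordinary services, the settings in system.conf are enough. Settings in system.conf.d/*.conf override those in system.conf.

However, a change to /etc/systemd/system.conf takes effect only after the system is rebooted.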

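For a single service, the limit can be set directly on the service itself, using the LimitNOFILE= directive in the [Service] section of its unit configuration.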
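
As a rough sketch of such a per-service setting, the helper below renders a systemd drop-in file and can write it under /etc/systemd/system/<unit>.d/. Several things here are assumptions made for illustration and do not come from the original article: the drop-in file name limits.conf, the function names, and the value 100000 (chosen only because it matches the default of max_open_files in the code above).

```python
import os


def render_nofile_dropin(limit):
    """Return the text of a systemd drop-in that sets LimitNOFILE."""
    return "[Service]\nLimitNOFILE={0}\n".format(limit)


def write_nofile_dropin(unit, limit, base_dir="/etc/systemd/system"):
    """Write <base_dir>/<unit>.d/limits.conf and return its path.

    'limits.conf' is an arbitrary file name chosen for this sketch.
    """
    dropin_dir = os.path.join(base_dir, unit + ".d")
    if not os.path.isdir(dropin_dir):
        os.makedirs(dropin_dir)
    path = os.path.join(dropin_dir, "limits.conf")
    with open(path, "w") as handle:
        handle.write(render_nofile_dropin(limit))
    return path


# Show the content that would be written; 100000 is an assumed value.
print(render_nofile_dropin(100000))
```

Writing to the real location, for example write_nofile_dropin("salt-master.service", 100000), requires root. Editing the [Service] section of the unit file itself is the equivalent manual route.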

Then run the following commands for the change to take effect:

sudo systemctl daemon-reload
sudo systemctl restart salt-master.service
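
After the restart, the value systemd has configured for the unit can be checked with systemctl show. The sketch below wraps that call; the function names are hypothetical, and the sample string passed to the parser is illustrative, not captured output.

```python
import subprocess


def parse_limit_nofile(show_output):
    """Parse 'LimitNOFILE=<value>' as printed by `systemctl show -p LimitNOFILE`."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == "LimitNOFILE":
            return value
    raise ValueError("LimitNOFILE not found in output")


def unit_limit_nofile(unit):
    """Ask systemd for the unit's configured open files limit."""
    output = subprocess.check_output(
        ["systemctl", "show", "-p", "LimitNOFILE", unit]
    )
    return parse_limit_nofile(output.decode())


# Illustrative input only; on a real host call unit_limit_nofile("salt-master.service").
print(parse_limit_nofile("LimitNOFILE=100000\n"))  # 100000
```

If the reported value is high enough, the CRITICAL message about accepted minion keys should no longer appear in the master log after the restart.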

 

