The following are mainly problems encountered during installation and deployment. Because OpenStack releases differ in their components, the installation procedure differs completely from version to version. After testing, I have successfully deployed both Essex and Grizzly. Folsom, the release in between, I never got working and did not spend much time on: the quantum component in Folsom was still immature, with many network-connectivity problems and very few successful cases online; most people were running Folsom with nova-network instead.
Only with Grizzly did quantum become stable enough for normal use; after spending a lot of time on it I can now deploy a multi-node environment successfully. Below are problems met while deploying both Essex and Grizzly. There is little material on this in Chinese, and most of what I found came from foreign sites. In many cases the log message is identical but the underlying cause is different, so you have to analyze the mechanism carefully to pinpoint the problem. Hitting errors is nothing to fear: working through them deepens your understanding of the system, which is a good thing.
For installation there are automated, one-click deployment tools such as devstack and onestack. If you are a beginner I do not recommend them: you will learn almost nothing from them. If everything works, congratulations for the moment; but as soon as an error appears somewhere in the middle you will be lost, with no idea where it went wrong, and later maintenance will be just as hard. You may end up spending more time troubleshooting, because you never learned which steps were performed or which configuration was applied. These tools are mostly meant for quickly standing up development environments; a real production environment should still be built step by step, so that problems can be located and debugged quickly.
This article only summarizes some of the error messages seen during deployment and gives the fixes. These are cases I hit and resolved in my own environment; they may differ in yours, so treat this as a reference only.
1. Check that the services are running normally
- root@control:~# nova-manage service list
- Binary           Host     Zone      Status   State  Updated_At
- nova-cert        control  internal  enabled  :-)    2013-04-26 02:29:44
- nova-conductor   control  internal  enabled  :-)    2013-04-26 02:29:42
- nova-consoleauth control  internal  enabled  :-)    2013-04-26 02:29:44
- nova-scheduler   control  internal  enabled  :-)    2013-04-26 02:29:47
- nova-compute     node-01  nova      enabled  :-)    2013-04-26 02:29:46
- nova-compute     node-02  nova      enabled  :-)    2013-04-26 02:29:46
- nova-compute     node-03  nova      enabled  :-)    2013-04-26 02:29:42
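A healthy service shows the smiley in the State column and a recent Updated_At; a dead one shows XXX and a stale timestamp. A quick way to spot broken services from the shell (a minimal sketch, assuming this nova-manage output format):
- # list only services whose state is reported as down (XXX)
- nova-manage service list 2>/dev/null | grep XXX
- # restart the offending service on its host, e.g. on a compute node:
- # service nova-compute restart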
- 2013-03-09 17:05:42 TRACE nova File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 338, in _connect
- 2013-03-09 17:05:42 TRACE nova return libvirt.openAuth(uri, auth, 0)
- 2013-03-09 17:05:42 TRACE nova File "/usr/lib/python2.7/dist-packages/libvirt.py", line 102, in openAuth
- 2013-03-09 17:05:42 TRACE nova if ret is None:raise libvirtError('virConnectOpenAuth() failed')
- 2013-03-09 17:05:42 TRACE nova libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
- 2013-03-09 22:05:41.909+0000: 12466: info : libvirt version: 0.9.8
- 2013-03-09 22:05:41.909+0000: 12466: error : virNetServerMDNSStart:460 : internal error Failed to create mDNS client: Daemon not running
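Both messages point at the same root cause: nova-compute cannot open /var/run/libvirt/libvirt-sock because libvirtd itself fails to start, and the libvirtd log shows it dying while trying to register with mDNS (avahi-daemon not running). A minimal sketch of the fix on the Ubuntu libvirt-bin packaging used here; either start avahi or turn mDNS advertisement off:
- # option 1: start the avahi daemon that libvirtd is trying to use
- service avahi-daemon start
- # option 2: disable mDNS advertisement in /etc/libvirt/libvirtd.conf
- #   mdns_adv = 0
- # then restart libvirt and nova-compute
- service libvirt-bin restart
- service nova-compute restart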
- Error:
- Failed to add image. Got error:
- The request returned 500 Internal Server Error
- OS_AUTH_KEY="openstack"
- OS_AUTH_URL="http://localhost:5000/v2.0/"
- OS_PASSWORD="openstack"
- OS_TENANT_NAME="admin"
- OS_USERNAME="admin"
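For this glance 500, the first thing to rule out is authentication: the variables above have to be exported in the shell that runs the glance client, and they must match what keystone knows. A minimal sketch using the values shown above (adjust them to your own credentials):
- export OS_USERNAME="admin"
- export OS_TENANT_NAME="admin"
- export OS_PASSWORD="openstack"
- export OS_AUTH_URL="http://localhost:5000/v2.0/"
- export OS_AUTH_KEY="openstack"
- # the listing should succeed without a 500 before retrying the upload
- glance index        # Essex-era CLI; on Grizzly use: glance image-list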
- Nova instance not found
- Local file storage of the image files.
- Error:
- 2013-03-09 17:58:08 TRACE nova raise exception.InstanceNotFound(instance_id=instance_name)
- 2013-03-09 17:58:08 TRACE nova InstanceNotFound: Instance instance-00000002 could not be found.
- 2013-03-09 17:58:08 TRACE nova
- $ mysql -u root -p
- DROP DATABASE nova;
- Recreate the DB:
- CREATE DATABASE nova; (strip formatting if you copy and paste any of this)
- GRANT ALL PRIVILEGES ON nova.* TO 'novadbadmin'@'%' IDENTIFIED BY '<password>';
- Quit
- Resync DB
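"Resync DB" here means re-creating the nova schema after the database has been dropped and recreated; a minimal sketch of that step with the Essex/Grizzly nova-manage CLI:
- # rebuild the nova schema in the freshly created database, then restart the nova services
- nova-manage db sync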
- #!/bin/bash
- # Remove the database records of a zombie instance; pass the instance UUID as $1.
- # The MySQL root password is hard-coded as 'mysql' here.
- mysql -uroot -pmysql << _ESXU_
- use nova;
- DELETE a FROM nova.security_group_instance_association AS a
-   INNER JOIN nova.instances AS b ON a.instance_uuid=b.id
-   WHERE b.uuid='$1';
- DELETE FROM nova.instance_info_caches WHERE instance_uuid='$1';
- DELETE FROM nova.instances WHERE uuid='$1';
- _ESXU_
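Example invocation (the script name and UUID are placeholders):
- bash clean_zombie_instance.sh <instance_uuid>
- # verify the rows are gone, then remove the leftover directory under
- # /var/lib/nova/instances/ on the compute node if it still exists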
- Error
- root@openstack-dev-r910:/home/brent/openstack# ./keystone_data.sh
- No handlers could be found for logger "keystoneclient.client"
- Unable to authorize user
- No handlers could be found for logger "keystoneclient.client"
- Unable to authorize user
- No handlers could be found for logger "keystoneclient.client"
- Unable to authorize user
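When keystone_data.sh fails like this it usually cannot authenticate at all: the token and endpoint it uses must match admin_token in /etc/keystone/keystone.conf and the admin port keystone is actually listening on. A minimal sketch of what to verify before re-running the script (the SERVICE_* names are the conventional ones such scripts read, an assumption about this particular script):
- # the token must equal admin_token in /etc/keystone/keystone.conf
- grep admin_token /etc/keystone/keystone.conf
- export SERVICE_TOKEN=<the admin_token value>
- export SERVICE_ENDPOINT=http://localhost:35357/v2.0/
- # keystone must answer on the admin endpoint before keystone_data.sh can work
- keystone --token $SERVICE_TOKEN --endpoint $SERVICE_ENDPOINT tenant-list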
- 2012-07-24 14:33:08 TRACE nova.rpc.amqp ProcessExecutionError: Unexpected error while running command.
- 2012-07-24 14:33:08 TRACE nova.rpc.amqp Command: sudo nova-rootwrap iscsiadm -m node -T iqn.2010-10.org.openstack:volume-00000011 -p 192.168.0.23:3260 --rescan
- 2012-07-24 14:33:08 TRACE nova.rpc.amqp Exit code: 255
- 2012-07-24 14:33:08 TRACE nova.rpc.amqp Stdout: ''
- 2012-07-24 14:33:08 TRACE nova.rpc.amqp Stderr: 'iscsiadm: No portal found.\n'
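"No portal found" during a volume attach means the compute node cannot reach an iSCSI target on the volume host (192.168.0.23 here), most often because the target daemon is not running or the volume was never exported. A minimal sketch of how to check, assuming the tgt target driver used by nova-volume/cinder on Ubuntu:
- # on the volume node: is the target daemon running and is this volume exported?
- service tgt status
- tgtadm --lld iscsi --op show --mode target | grep volume-00000011
- # from the compute node: can the portal be discovered at all?
- iscsiadm -m discovery -t sendtargets -p 192.168.0.23:3260
- # also confirm iscsi_ip_address in nova.conf / cinder.conf points at 192.168.0.23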
- Authorization Failed: Unable to communicate with identity service: {"error": {"message": "An unexpected error prevented the server from fulfilling your request. Command 'openssl' returned non-zero exit status 3", "code": 500, "title": "Internal Server Error"}}. (HTTP 500)
2013-03-04 12:40:58 ERROR [keystone.common.cms] Signing error: Error opening signer certificate /etc/keystone/ssl/certs/signing_cert.pem
139803495638688:error:02001002:system library:fopen:No such file or directory:bss_file.c:398:fopen('/etc/keystone/ssl/certs/signing_cert.pem','r')
139803495638688:error:20074002:BIO routines:FILE_CTRL:system lib:bss_file.c:400:
unable to load certificate
2013-03-04 12:40:58 ERROR [root] Command 'openssl' returned non-zero exit status 3
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/keystone/common/wsgi.py", line 231, in __call__
    result = method(context, **params)
  File "/usr/lib/python2.7/dist-packages/keystone/token/controllers.py", line 118, in authenticate
    CONF.signing.keyfile)
  File "/usr/lib/python2.7/dist-packages/keystone/common/cms.py", line 140, in cms_sign_token
    output = cms_sign_text(text, signing_cert_file_name, signing_key_file_name)
  File "/usr/lib/python2.7/dist-packages/keystone/common/cms.py", line 135, in cms_sign_text
    raise subprocess.CalledProcessError(retcode, "openssl")
CalledProcessError: Command 'openssl' returned non-zero exit status 3
token_format = UUID
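The traceback means keystone is configured for PKI tokens but the signing certificate /etc/keystone/ssl/certs/signing_cert.pem does not exist, so every token request fails in the openssl call. The token_format = UUID line above is the workaround; a minimal sketch of both options, assuming the Grizzly keystone packaging (token_format normally lives in the [signing] section of /etc/keystone/keystone.conf):
- # option 1: keep PKI tokens and generate the missing signing material
- keystone-manage pki_setup --keystone-user keystone --keystone-group keystone
- service keystone restart
- # option 2: fall back to UUID tokens in /etc/keystone/keystone.conf
- #   [signing]
- #   token_format = UUID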
- kvm -m 512 -boot d -drive file=win2003server.img,cache=writeback,if=virtio,boot=on -fda virtio-win-1.1.16.vfd -cdrom w
- kvm -m 1024 -drive file=win2003server.img,if=virtio,boot=on -cdrom virtio-win-0.1-30.iso -net nic,model=virtio -net user -boot c -nographic -vnc 8
The parts to pay attention to here are if=virtio,boot=on together with -fda virtio-win-1.1.16.vfd in the first command, and the virtio-win-0.1-30.iso attached when booting the system in the second: they carry the virtio disk driver and the virtio NIC driver respectively. Without the floppy driver the Windows installer sees no hard disk at all, and instances created from the finished image have no driver for their NIC. So after the initial installation you need to boot the image again and install/update the NIC driver to virtio from the ISO.
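Once the image boots cleanly with virtio disk and NIC, the last step is to upload it to glance so nova can launch instances from it; a minimal sketch, assuming a Grizzly-era glance client and that win2003server.img is a raw image (the image name is arbitrary):
- glance image-create --name win2003-virtio --disk-format raw --container-format bare --is-public true --file win2003server.img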
11. Deleting zombie volumes
If the cinder service is not healthy, creating volumes can leave behind zombie volumes. If they cannot be deleted from horizon, delete them by hand on the server with: lvremove /dev/nova-volumes/volume-000002. Note that you must give the full path here, otherwise the removal fails. If lvremove complains "Can't remove open logical volume", try stopping the related services and then removing it again. After deleting the logical volume you also have to clean the corresponding records out of the volumes table in the cinder database.
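A minimal sketch of that cleanup, assuming the tgt iSCSI target, the nova-volumes volume group and the volume name from above; the SQL simply marks the row as deleted in the cinder volumes table (the volume id is a placeholder):
- # stop whatever still holds the logical volume open, remove it, then restart the services
- service tgt stop
- service cinder-volume stop
- lvremove /dev/nova-volumes/volume-000002
- service tgt start
- service cinder-volume start
- # clear the leftover record in the cinder database
- mysql -uroot -p cinder -e "UPDATE volumes SET deleted=1, status='deleted' WHERE id='<volume_id>';"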