If you’ve been running OpenStack from the developer trunk (as per my blog), you will occasionally come across bugs. This is the nature of the beast when running bleeding-edge code.
So how do you track down a solution for them?
Step 1. Check the logs
The first place to look is /var/log/nova, where you will find the logs related to OpenStack.
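As a rough sketch of that first pass, the following Python 3 snippet lists every ERROR or CRITICAL line in the *.log files under a directory. The function name find_problem_lines is made up for illustration, and the snippet assumes the severity word appears in each line the way it does in the log excerpts quoted in this post.

```python
import os
import re

# Severity words as they appear in the log lines quoted in this post.
SEVERITY = re.compile(r"\b(CRITICAL|ERROR)\b")


def find_problem_lines(log_dir="/var/log/nova"):
    """Return (filename, line_number, text) for each ERROR/CRITICAL line."""
    hits = []
    for name in sorted(os.listdir(log_dir)):
        # Only look at files named *.log (an assumption about the layout).
        if not name.endswith(".log"):
            continue
        path = os.path.join(log_dir, name)
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if SEVERITY.search(line):
                    hits.append((name, number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    if os.path.isdir("/var/log/nova"):
        for name, number, text in find_problem_lines():
            print("%s:%d: %s" % (name, number, text))
```

Reading the logs usually needs the right permissions, so you may have to run this as root or as the nova user.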
Some bugs will be related to changes in the software, so an extra config line may be needed in /etc/nova/nova.conf.
For example, you may have seen this in /var/log/nova/nova-network.log:
2011-03-15 17:33:35,732 CRITICAL nova [-] failed to create /usr/lib/pymodules/python2.6/cloud2.MainThread-18360
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "/usr/bin/nova-network", line 48, in
(nova): TRACE: service.serve()
(nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 284, in serve
(nova): TRACE: x.start()
(nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/service.py", line 84, in start
(nova): TRACE: self.manager.init_host()
(nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 489, in init_host
(nova): TRACE: super(VlanManager, self).init_host()
(nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/manager.py", line 125, in init_host
(nova): TRACE: self.driver.init_host()
(nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/network/linux_net.py", line 394, in init_host
(nova): TRACE: iptables_manager.apply()
(nova): TRACE: File "/usr/lib/pymodules/python2.6/nova/utils.py", line 523, in inner
(nova): TRACE: with lock:
(nova): TRACE: File "/usr/lib/pymodules/python2.6/lockfile.py", line 223, in __enter__
(nova): TRACE: self.acquire()
(nova): TRACE: File "/usr/lib/pymodules/python2.6/lockfile.py", line 239, in acquire
(nova): TRACE: raise LockFailed("failed to create %s" % self.unique_name)
(nova): TRACE: LockFailed: failed to create /usr/lib/pymodules/python2.6/cloud2.MainThread-18360
(nova): TRACE:
A change between releases means the following lines must be present in /etc/nova/nova.conf to fix this:
--state_path=/var/lib/nova
--lock_path=/var/lock/nova
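A quick way to confirm those two flags are actually set is sketched below in Python 3. The helper name missing_flags is hypothetical, and the parsing assumes the simple --name=value form shown above with no spaces inside values; it is not a full parser for the nova flag file.

```python
import os


def missing_flags(conf_text, required=("state_path", "lock_path")):
    """Return the names in `required` that have no --name=value entry."""
    present = set()
    # Split on any whitespace so flags work one per line or side by side.
    for token in conf_text.split():
        if token.startswith("--") and "=" in token:
            present.add(token[2:].split("=", 1)[0])
    return [name for name in required if name not in present]


if __name__ == "__main__":
    path = "/etc/nova/nova.conf"
    if os.path.exists(path):
        with open(path) as handle:
            print("missing flags:", missing_flags(handle.read()))
```

An empty list means both flags are present; it says nothing about whether the directories they point to exist or are writable by the nova services.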
Step 2. Check https://bugs.launchpad.net/nova and https://bugs.launchpad.net/swift
A recent one I came across this morning was the following:
2011-03-17 08:49:19,160 ERROR nova.api [GXEJM3P1HVP7N53IGI5J admin myproject] Unexpected error raised: invalid literal for int() with base 16: 'ami-jqxvgtmd'
(nova.api): TRACE: Traceback (most recent call last):
(nova.api): TRACE: File "/usr/lib/pymodules/python2.6/nova/api/ec2/__init__.py", line 318, in __call__
(nova.api): TRACE: result = api_request.invoke(context)
(nova.api): TRACE: File "/usr/lib/pymodules/python2.6/nova/api/ec2/apirequest.py", line 150, in invoke
(nova.api): TRACE: result = method(context, **args)
(nova.api): TRACE: File "/usr/lib/pymodules/python2.6/nova/api/ec2/cloud.py", line 906, in describe_images
(nova.api): TRACE: images = self.image_service.detail(context)
(nova.api): TRACE: File "/usr/lib/pymodules/python2.6/nova/image/s3.py", line 76, in detail
(nova.api): TRACE: images = self.service.detail(context)
(nova.api): TRACE: File "/usr/lib/pymodules/python2.6/nova/image/local.py", line 58, in detail
(nova.api): TRACE: for image_id in self._ids():
(nova.api): TRACE: File "/usr/lib/pymodules/python2.6/nova/image/local.py", line 50, in _ids
(nova.api): TRACE: return [int(i, 16) for i in os.listdir(self._path)]
(nova.api): TRACE: ValueError: invalid literal for int() with base 16: 'ami-jqxvgtmd'
(nova.api): TRACE:
I found this related bug: https://bugs.launchpad.net/nova/+bug/735641 by searching the bug database for the error. In this case, the solution is to remove my images from my objectstore and re-upload them, because the way images are stored and retrieved has changed.
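The last frame of that traceback shows why the old images break things: every name in the image directory is passed to int(i, 16), and a name such as 'ami-jqxvgtmd' is not a hexadecimal number. The small Python sketch below reproduces that check so you can see which names would be rejected. The helper split_image_ids is a made-up name for illustration, and the sample names other than 'ami-jqxvgtmd' are invented.

```python
def split_image_ids(names):
    """Separate names that parse as base-16 integers from those that do not."""
    parsed, rejected = [], []
    for name in names:
        try:
            # Same conversion as the failing line in nova/image/local.py.
            parsed.append(int(name, 16))
        except ValueError:
            rejected.append(name)
    return parsed, rejected


if __name__ == "__main__":
    parsed, rejected = split_image_ids(["1a", "ami-jqxvgtmd", "ff"])
    print("parsed:", parsed)
    print("rejected:", rejected)
```

To check a real directory you could pass os.listdir() of your image path instead of the sample list. This only identifies the offending names; the fix is still to remove and re-upload the images.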
Step 3. Ask on IRC at freenode.net
IRC is always a good place to go: connect via http://webchat.freenode.net/ and join #openstack, where the developers and contributors will answer your questions. Have patience though, as they have work to do too.
Step 4. You can also ask questions on Launchpad: https://answers.launchpad.net/nova/+addquestion (and similar for swift).
I also find it’s handy to not be too vague. Describe your setup: saying “I launch an instance and it’s stuck on Scheduling, can you help?” doesn’t give anyone any details about why that could be the case. As you can imagine, the cause could be anything from user error to hardware errors, software errors or a misconfigured environment, each requiring a different way to troubleshoot, so help yourself by being specific.
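One concrete detail worth pasting into a question is the most recent traceback from the relevant log. The Python 3 sketch below pulls it out, assuming each log entry sits on its own line and that traceback lines contain the text "TRACE:" as in the excerpts in this post. The function name last_traceback is hypothetical.

```python
def last_traceback(lines):
    """Return the final run of TRACE lines plus the log line that precedes it."""
    # Find the last line that belongs to a traceback.
    end = None
    for index in range(len(lines) - 1, -1, -1):
        if "TRACE:" in lines[index]:
            end = index
            break
    if end is None:
        return []
    # Walk back to the first TRACE line of that same run.
    start = end
    while start > 0 and "TRACE:" in lines[start - 1]:
        start -= 1
    # Include the ERROR/CRITICAL line that introduced the traceback, if any.
    if start > 0:
        start -= 1
    return lines[start:end + 1]


if __name__ == "__main__":
    sample = [
        "INFO nova [-] started",
        "ERROR nova.api [-] Unexpected error raised",
        "(nova.api): TRACE: Traceback (most recent call last):",
        "(nova.api): TRACE: ValueError: invalid literal",
        "INFO nova [-] still running",
    ]
    print("\n".join(last_traceback(sample)))
```

Alongside the traceback, mention which release or trunk revision you are running and what you did just before the error appeared.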
For more information check out the OpenStack Wiki on contributing to the project: http://wiki.openstack.org/HowToContribute and this information on support and troubleshooting from the docs: http://docs.openstack.org/openstack-compute/admin/content/ch08.html