CloudStack fails to start more VMs #10205
Comments
More info and logs are available in the linked discussion.
Just noticed that after switching back to the primary VR, there are numerous errors in the VR's cloud.log,
and the VMs that were started had no outside access.
An additional VR issue: VMs stopped getting their hostname from the metadata service, even though the VR is accessible.
It's quite long.
It's even uglier, as the lines just repeat:
After a network restart (with cleanup), the VR booted with "clean"
I checked one VR in my testing env. There are a lot of duplicated lines in .htaccess, but this may not be a major issue.
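For anyone who wants to measure how bad the duplication is, here is a minimal sketch that counts repeated lines in a file's text. It is only an illustration: the function name `duplicated_lines` and the sample content are made up for this example and are not taken from the VR scripts or from a real .htaccess.

```python
from collections import Counter


def duplicated_lines(text):
    """Return {line: count} for non-empty lines that occur more than once."""
    counts = Counter(
        line.strip() for line in text.splitlines() if line.strip()
    )
    return {line: n for line, n in counts.items() if n > 1}


# Hypothetical sample content, only to show the shape of the result.
sample = "first rule\nsecond rule\nsecond rule\nsecond rule\n"
print(duplicated_lines(sample))
```

On a VR, the text would come from reading the .htaccess file in question and passing its content to the function.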
The main issue in my env is that it's stuck too often.
Also, the reason I got to it: .htaccess was broken, and the metadata service stopped working. Most likely, because of the enormous number of writes to the file, two writes happened simultaneously and wrote to the same line.
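If simultaneous writes really are the cause (this is only a guess at this point), one common way to avoid both the interleaving and the duplicates is to take an exclusive lock and append a line only when it is not already present. The sketch below shows that general technique with `fcntl.flock`; the helper name `append_line_once`, the file name and the rule text are hypothetical, and this is not how the VR scripts currently write the file.

```python
import fcntl
import os
import tempfile


def append_line_once(path, line):
    """Append `line` to `path` only if it is absent, under an exclusive lock.

    Returns True if the line was written, False if it was already there.
    """
    # "a+" creates the file if needed; writes always go to the end.
    with open(path, "a+") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until other writers finish
        try:
            fh.seek(0)
            existing = {l.rstrip("\n") for l in fh}
            if line in existing:
                return False
            fh.write(line + "\n")
            fh.flush()
            return True
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)


# Hypothetical usage against a temporary file, not a real .htaccess.
path = os.path.join(tempfile.mkdtemp(), "demo.htaccess")
first = append_line_once(path, "example rule")
second = append_line_once(path, "example rule")
print(first, second)
```

Note that `flock` is advisory, so it only helps if every writer to the file goes through the same locking path.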
Finally, I got into a weird state where restarting libvirtd, the agents and management does not help,
and the UI becomes very slow again.
I will fix the issue with
Discussed in #10184
Originally posted by akrasnov-drv January 14, 2025
Hi,
I'm struggling to make CloudStack 4.20.0.0 properly start KVM VMs on Ubuntu 22.
We have an isolated network over VLAN.
CloudStack manages to start a single VM and to add several more. But when I ask it to start more (e.g. 10-30), CloudStack starts behaving weirdly.
New VMs produce different errors, then CloudStack becomes slow, does not clean up resources, and in the end is left with a number of VMs in the Starting state.
I have 5 KVM servers connected, each able to handle 30 VMs alone (in KVM without CloudStack). The VMs use local server storage. I do not see any resource problem.
I tried to debug the issue, and it looks like the virtual router stops working properly. I found in its log that it restarts its managing script at some point, but some of the VMs still do not get a proper network config. Enabling static NAT also returns errors.
Error while enabling static nat. Ip Id: 14
Expunging VMs then also hangs.
In addition, I sometimes see KVM hosts stop communicating with management and stop writing to their local logs.
To recover, I need to restart management, delete the virtual router and clean up stuck resources, sometimes directly in the MySQL DB. An agent restart is also sometimes needed.
Any help in understanding and fixing the problem is highly appreciated.
I'll provide logs or other info on request.
Thanks,
Alex.
To summon