Finally, I’ve been able to record a video showing how the QoS service plugin works. If you want to deploy this, follow the instructions under the video. (Open in Vimeo for better quality: https://vimeo.com/136295066)
Now create rules to allow traffic to the VM on port 22 and ICMP:
neutron security-group-rule-create --direction ingress \
--protocol tcp \
--port-range-min 22 \
--port-range-max 22 \
*your-security-group-id*
neutron security-group-rule-create --protocol icmp \
--direction ingress \
*your-security-group-id*
nova boot --image cirros-0.3.4-x86_64-uec --flavor m1.tiny \
--nic net-id=*your-net-id* qos-cirros
nova show qos-cirros # look for the IP
neutron port-list # look for the IP and find your *port id*
In yet another console, look for the port and monitor it
# given a port id 49d4a680-4236-4d0c-9feb-8b4990ac35b9
# look for the ovs port:
$ sudo ovs-vsctl show | grep qvo49d4a680-42
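The grep pattern above appears to be the prefix qvo followed by the first 11 characters of the Neutron port id. A minimal sketch of that derivation follows; the helper name ovs_port_name is hypothetical, and the rule is inferred from the example in this post, so check it against your own deployment:

```shell
# Hypothetical helper: build the OVS port name from a Neutron port id,
# assuming the name is "qvo" + the first 11 characters of the id
# (inferred from the example above, not taken from the Neutron sources).
ovs_port_name() {
    printf 'qvo%.11s\n' "$1"
}

port_id=49d4a680-4236-4d0c-9feb-8b4990ac35b9
ovs_port_name "$port_id"
```

The result can be passed to grep in place of the hand-typed name, for example: sudo ovs-vsctl show | grep "$(ovs_port_name "$port_id")"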
Finally, try the QoS rules:
neutron qos-policy-create bw-limiter
neutron qos-bandwidth-limit-rule-create bw-limiter \
--max-kbps 3000 --max-burst-kbps 300
# after next command, the port will quickly go down to 3Mbps
neutron port-update *your-port-id* --qos-policy bw-limiter
You can change rules at runtime, and the ports will be updated.
Last week we had the OpenStack Neutron Quality of
Service coding sprint in Ra’anana, Israel.

It’s been an amazing experience. We’ve accomplished a lot, but we
still have a lot ahead.

We gathered at the Red Hat office for three days,
delivering almost (sigh!) the full stack for the QoS service
with bandwidth limiting.

The first day we had a short meeting where we went over
the whole picture of blocks and dependencies that we had.
The people from Huawei India (hi Vikram Choudhary & Ramanjaneya Reddy) helped us
remotely by bootstrapping the DB models and the neutron client.
Eran Gampel (Huawei), Irena Berezovsky (Midokura) and Mike Kolesnik (Red Hat)
revised the API for REST consistency during the first day, and provided an
amendment to the original spec, the API extension, and
the service plugin. Concurrently, John Schwarz (Red Hat) was working on the API tests,
which acted as validation of the work they were doing.
Ihar Hrachyshka (Red Hat) finished the DB models and submitted the first neutron
versioned objects ever on top of the DB models. I
recommend reading those patches; they are like the nirvana
of coding ;).
Mike Kolesnik plugged the missing callbacks for extending
networks and ports. Some of those, the ones extending object reads,
will be moved to a new neutron.callbacks interface.

I mostly worked on coordination and on writing some code
for the generic RPC callbacks to be used with versioned objects,
where I had lots of help from Eran and Moshe Levi (Mellanox). The current
version is very basic: it does not support object updates, only the initial
retrieval of the resources, hence it is not a real callback ;) (yet!).
Eran wrote a pluggable driver backend interface for the service,
with a default rpc/messaging backend which fitted very nicely.
Gal Sagie (Huawei) and Moshe Levi worked at the agent level. Gal created
the QoS OvS library with the ability to manipulate queues, configure
the limits, and attach those queues to ports. Moshe led
the agent design, providing an interface for dynamic agent extensions,
a QoS agent extension interface, and the example for SRIOV.
Gal then coded the OvS QoS extension driver.
During the last day, we tried to put all the pieces together, John
was debugging API->SVC->vo->DB (you’d be amazed if you saw him
going through vim or ipdb at high speed). Ihar was polishing the models
and versioned objects, Mike was polishing the callbacks, and I was
tying together the agent side. We were not able to fully assemble
a POC in the end, but we were able to interact with neutron client
to the server across all the layers. And the agent side was looking
good, but I managed to destroy the environment I was using, so I will
be working on it next week.
The plan ahead
We need to assemble the basic POC, make a checklist for missing tests and TODO(QoS) items, and start enforcing full testing for any other non-POC-essential patch.
Doing it as I write: https://etherpad.openstack.org/p/neutron-qos-testing-gaps
Once that’s done we may be ready to merge back at the end of liberty-2, or at the very start of the next milestone, liberty-3. Since QoS is designed as a separate service, most of the pieces won’t be activated unless explicitly installed, which makes the risk of breaking anything for anyone not using QoS very low.
What can be done better
Better coordination (in general): I’m not awesome at that,
but I guess I had the whole picture of the service, so that’s
what I did.
Better coordination with remotes: it’s hard when you have a lot
of ongoing local discussions and very limited time to sprint.
I’m looking forward to finding ways to improve that part.
In my opinion, the mid-cycle coding sprint was very positive: the ability to meet every day, do fast cross-reviews, and very quickly loop specific people into specific topics was very productive.
I guess remote coding sprints should be very productive too, as long as companies guarantee people the ability to focus on the specific topic. That said, the face-to-face part is always very valuable.
I was able to learn a lot from all the other participants about specific parts of neutron I wasn’t fully aware of, and by building a service plugin we all got an understanding of full-stack development: from API request to database, messaging (or not), agents, and how it all fits together.
Special thanks to Gary Kotton for joining us the first day to understand our plan, and for helping us later with reviews towards merging patches on the branch.
To Livnat Peer, for organizing the event within Red Hat, and making sure we prioritized everything correctly.
To Doug Wiegley and Kyle Mestery, for helping us with rebases from master to the feature branch to clean up gate bugs on time.
Sometimes you write a piece of code within a context, and that context grows wider and wider, or you simply need all the pieces in one place to make sure it works.
Then, for reviewing, or to work in parallel, it makes sense to split your patch into more logical patchlets. I always need to ask Google, so let’s write it down here:
Let’s assume $COMMIT is the commit you want to split (set the commit for edit with the edit action):
git rebase -i $COMMIT^
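This stops the rebase at your commit. The following reset leaves the commit’s changes in the working tree, but moves you back to the previous commit:
git reset HEAD^
git add -p # stage the pieces of code you want in the first patch
git commit
git add -p # stage the next pieces; repeat until nothing is left
git commit
git rebase --continue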
If you were working with gerrit, make sure that only one of your patches (probably the biggest one) keeps the original change ID, so the change can still be tracked, and old comments will be available.
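To try the recipe safely, here is a minimal sketch that replays it in a throwaway repository. Everything in it is an assumption made for illustration: the file names, the commit messages, and the use of GIT_SEQUENCE_EDITOR with sed to pick the edit action without opening an editor. It also stages whole files with plain git add, because git add -p is interactive and skips untracked files.

```shell
# Throwaway repository; names and messages below are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# One commit that bundles two unrelated files.
echo one > one.txt
echo two > two.txt
git add one.txt two.txt
git commit -q -m "big commit"

# Mark the last commit for "edit" without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i HEAD^

# Undo the commit but keep its changes in the working tree.
git reset -q HEAD^

# Re-commit the changes as two smaller patches.
git add one.txt
git commit -q -m "part one: add one.txt"
git add two.txt
git commit -q -m "part two: add two.txt"

git rebase --continue
git log --format=%s
```

The final log should list the two smaller commits in place of the original one. On a real branch you would use git add -p to pick hunks instead of whole files.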
(image credits go to: http://www.nicartoons.com/wallpapers/?id=1)
More interesting git stuff (fixup and autosquash): http://fle.github.io/git-tip-keep-your-branch-clean-with-fixup-and-autosquash.html (thanks to Jakub Libosvar!)
Sometimes you find yourself trying to debug a problem with SELinux, especially during software development or when packaging new software features.
I have found this happens quite often with neutron agents, as new system interactions are developed.
Disabling SELinux during development is generally a bad idea, because you’ll discover such problems later and under higher pressure (release deadlines).
Here we show a recipe, from Kashyap Chamarthy, to find out what rules are missing, and generate a possible SELinux policy:
Make sure selinux is enabled
sudo su -
Clear your audit log, and supposing the problem was in neutron-dhcp-agent,
At that point, report a bug so you get those policies incorporated in advance.
Give a good description of what’s blocked by the policies, and why it needs to be unblocked.
In the meantime, you can generate a SELinux loadable module and install it locally, to move on without
disabling SELinux entirely:
cat /var/log/audit/audit.log | audit2allow -a -M neutron
And you can also install it at runtime:
semodule -i neutron.pp
Restart neutron-dhcp-agent (or re-trigger the problem to make sure it’s fixed)
We found during scalability tests that the security_group_rules_for_devices RPC, which is transmitted from neutron-server to the neutron L2 agents during port changes, grew exponentially as the number of ports increased.
So we filed a spec for juno-3. The effort, led by shihanzhang and me, can be tracked here:
I have written a test and a little (dirty) benchmark (https://review.openstack.org/#/c/115575/1/neutron/tests/unit/test_security_groups_rpc.py, line 418) to check the results and make sure the new RPC actually performs better.
Here are the results:
Message size (Y) vs. number of ports (X) graph:
RPC execution time in seconds (Y) vs. number of ports (X):