The links to the commits are probably dead because I rebased the branch. Check the branch for all the commits
August 14 - 17:
Commits:
- Added tag destruction in netfront_detach()
- Changed loop limit from pool_idx to NET_RX_RING_SIZE in disconnect_rxq()
- Added a warning printf in disconnect_rxq() when some maps are in-use
- Changed bus_dmamap_load_mbuf() to bus_dmamap_load()
- Changed the alignment from 512 to 1 in netfront’s dma tag
Summary:
- Started converting the transmit side of netfront.
- Almost complete. I can boot, and use ping without a problem.
- But when I tried to scp a large file from one system to the Xen DomU, SSH fails to connect.
- Also, if I try to send a file, it panics. Sending works fine without the transmit patch, just the receive patch, so it is a bug I introduced with the transmit update.
- I will try to fix it in the upcoming weekend.
- This is the last part. Once I have fixed the bugs, I’ll ask Roger to review. After this, we can start the process of getting it merged upstream.
August 10-13:
Commits:
- Added a comment explaining the working of xen_dmamap_get_grefs()
- Added a NULL check in xen_bus_dmamap_destroy()
- Fixed preallocated grant reference access not being ended on unload
- Netfront’s rxqs now use the new busdma implementation.
- Changed xn_alloc_rx_buffers()'s dma callback to a generic one
- Updated xn_move_rx_slot() to move the map along with the mbuf and ref
- Fixed incorrect map pointer in xn_get_responses
- Updated xn_rebuild_rx_bufs to use the new busdma implementation
- Added function xn_get_map_gref()
Summary:
- Minor fixes.
- Almost finished updating netfront’s receive side of things to use the new busdma implementation.
- It works, but only partly. The kernel boots and connects to xn0 without any problems. Also pinging also works.
- But when I tried to scp a large file over, the kernel panics because of a failed assertion. This is probably because I have not yet figured out how I can update
xn_rebuild_rx_bufs()
. - Fixed the failed assertion. It was because of incorrectly indexing the map we needed to look up.
- Now I can download large files without a problem. Tested using
scp
andpkg install
. - More small fixes and readability improvements.
August 6 - 9:
Commits:
Summary:
- Partially implemented multi-load support, when I realized a flaw in my plan.
- I was planning that we create one map for each rx or tx queue, and call load for every grant reference we need. Whenever we need a grant ref, we load exactly one page, so one reference is used. The call to
xen_dmamap_get_grefs()
would return the most recently allocated grant reference. This way we can get the reference. - The problem comes with unloads. We can’t unload parts of the map individually. An unload just unloads everything. So, we can’t individually free grant references. This essentially makes it impossible to maintain a pool of grant references, like netfront does.
- Saved the partially completed code in a different branch, in case I need it later. Check it here.
- So I have now decided it would be better to just create a map for each grant reference, like OpenBSD does. I will need to confirm with Roger that he is on board with this idea. I have already started updating netfront in the meantime.
- Added the ability to reserve grant references on map creation.
August 4 - 5:
Commits:
- Explained the dance with temp_segs in a comment.
- Updated whitespace to conform with the style(9) guidelines
- Fixed lines crossing 80 cols
- Fixed a deadlock on xbd_io_lock and gnttab_list_lock
Summary:
- Spent some time cleaning up the code. Reading the style(9) guidelines, I realize I have been doing some things wrong. It says tab width is 8, and I have been developing with a tab width of 4. Because of this, some lines cross 80 columns. Fixed them.
- Fixed some more minor style problems.
- Documented the reasons behind using
temp_segs
in a code comment. Roger asked me about why I was doing this, because even he was confused, so I guess for anyone unfamiliar it would make absolutely no sense. Hopefully it would be easier to understand now. - Fixed a potential deadlock between
xbd_io_lock
andgnttab_list_lock
. I am still in discussion with Roger about this, but his suggestion to use a taskqueue(9) wouldn’t work because the deadlock would happen before we ever get the chance to get to the callback. So the fix has to be inlockfunc
supplied to the tag creation. - Started adding support for loading an already-loaded map.
- Down with fever on Sunday. Wanted to complete support for loading already-loaded maps, but couldn’t.
August 2 - 3:
Commits:
Summary:
- Two rather busy days of college. Couldn’t do as much as work as I would have liked. I’ll try to compensate it this weekend.
- Discussed the issue of locking order with Roger. The conversation is still going, but I have a good idea on how to fix it. Let’s see if Roger agrees.
- Wrote comments that explain why we need to copy the segs array into
temp_segs
. Roger asked me about it in a review, and it prompted a long discussion about it. It is rather confusing to anyone not intimate with the implementation, so a detailed comment would hopefully shed some light on why we do it.
July 31 - August 1:
Commits:
Summary:
- Finally able to completely test the busdma implementation in the shortage of grant references. After some fixes, it works.
- But there is a problem: if one map is waiting for grant references, and another map calls unload, making ample grant references available, the callback will be called with
xbd_io_lock
held. Butxen_gnttab_free_callback()
tries to acquire the lock before calling the client callback. This would result in a recursion on the non-recursive lock causing a panic. - But the lock should be held when
xbd_queue_cb()
(the callback specified to load by blkfront) is called. - Make
xbd_io_lock
recursive. It works, but I get a lock order reversal warning. From what I understand about lock order reversal warnings, they mean that if I was unlucky, I could have caused a deadlock. Looking in the locking order, sure enough, it is possible to deadlock in the following situation: One thread holdsxbd_io_lock
and it tries to allocate grant refs, so it tries to acquiregnttab_list_lock
. Butgnttab_list_lock
is held by another thread that is currently freeing some grant references. If in this situation, the free callback is received, the thread holdinggnttab_list_lock
will try to acquirexbd_io_lock
, causing a deadlock. - Need to ask Roger how I can fix it.
July 29-30:
Summary:
- More testing.
- Finally able to produce a shortage of grant refs.
- The kernel panics when the grant table callback is received because of a recursion on the non-recursive lock
gnttab_list_lock
. More details here. - Fixed it by adding a check in
get_free_entries()
about whether the thread is holding the lock or not. If it is, don’t try to acquire the lock again. - Roger thinks it is better to make it a recursive lock. Submitted a patch. Accepted and committed by Roger.
- More testing reveals that I probably got lucky when I was able to receive the grant table callback, because right now, the kernel deadlocks (I think) whenever I try to produce a shortage of grant references. It is either a deadlock or the busdma implementation is not releasing the grant references properly. Will need to look into the blkfront code to see where things are going wrong.
- Testing for low grant refs has low return on effort I believe. Before the patch I submitted, it was IMPOSSIBLE to use the callback without causing a panic, and it never got fixed until today. So apparently shortage of grant references does not happen very often.
July 26 - 28:
Commits:
- Removed unused function xbd_restart_queue_callback()
- Fix dangling pointers in xbd_command after indirection pages are freed
- Fixed a memory leak and double free in the blkfront driver
Summary:
- It is really tough to balance GSoC and college. But I just have to suck it up. Working 75+ hours a week is really tiring.
- Trying to produce a shortage of grant references has led me on a long chase. Turns out, there are more bugs in the Xen code (the code that is already there, not the one I’m writing) than I anticipated.
- If USB is allowed to have root mount hold, and there is a situation with scarcity of grant references, there is a kernel panic originating from an incorrect list removal from
usb_bus_explore()
's call toroot_mount_rel()
. - I have no idea how to fix it. It is not directly caused by the grant table subsystem, so I’m lost.
- Circumvented it by setting
USB_HAVE_ROOT_MOUNT_HOLD
to 0 indev/usb/usb_freebsd.h
. - It is a lot of trial and error involved here. It turns out, if I have a maximum of 8 grant table pages, I can boot a DomU and produce a shortage of grant refs.
- Luckily, I just discovered a double free which happens when a disk is removed on a running system. Check the commit here for more details.
- Running multiple block devices at the same time is proving to be a little complicated. I want to try with one disk first. Then I’ll go to multiple disks.
- Ran
make -j4 kernel
to see if the build completes. It goes on for a while but then freezes because of a shortage of grant refs. - If I do ^C, it gets me back to the shell, but any disk operation permanently freezes the system. Maybe the grant references are not being freed properly?
- I’ll admit, I am getting a little bored of this shortage of grant refs thing, but I have to test it. But, as a change, I started looking at the netfront code.
- It is rather straightforward, but the current implementation can’t be used with netfront. That is because right now, both allocation and mapping of grant refs happens at the same time. This won’t work with netfront. This is because netfront maintains a pool of grant references upon connection, and the places where the grant refs are mapped do not expect failure.
- There are two ways to approach this:
- Add the ability to set the refs array of the map by the client. This way, netfront can allocate its references and then set it up in the map.
- Add the ability to pre-allocate grant references, and map them one-by-one. But for this, a mechanism to specify which reference is being mapped needs to be introduced. I need to talk with Roger about this.
- One more essential thing I need to implement is the ability to load an already-loaded map. This is because netfront maps one grant ref at a time and creating a new map for each grant reference is overkill.
July 24-25:
Summary:
- A lot of trial-and-error later, I figure out that the lowest value of
gnttab_max_frames
at which I can get the kernel to boot is 4. - But at 4 frames, the Xen DomU does not boot. The netfront driver prints “Failed to allocate tx refs” and then a page fault happens.
- Tried to figure out a fix for the page fault, so I can submit the fix.
- It originates from
gnttab_free_grant_references()
which is called fromxn_setup_txqs()
. The reason for the page fault is that earlier when the allocation of grant refs failed, the gref_head was set toGNTTAB_LIST_END
.gnttab_free_grant_references()
does not check for an invalid gref_head and thus causes a page fault. - Fixed that, but now there is a page fault in
gnttab_end_foreign_access_ref()
. This is caused because on fail,xn_setup_txqs()
callsxn_disconnect_txq()
which callsgnttab_end_foreign_access_ref()
. But since there was a shortage of grant refs, the ring reference isGNTTAB_REF_INVALID
and this causes a page fault. - Fixed that too, but now the kernel panics somewhere else.
- I can’t spend too much time on this bug. I already have limited time with my college going on. I’ll let Roger know about this, and maybe he’ll figure something out.
- I’ll try to test the busdma implementation in a shortage of grant refs, and then start working on netfront ASAP.
July 23:
Summary:
- An email from Google arrived talking about how I should submit my work for the final evaluation. Will need to talk with my mentors about it.
- Like Roger suggested, I started looking into creating a shortage of grant refs. When I set
gnttab_max_frames=1
in the Xen command line, it does not even boot. Need to figure out the minimum number of frames I need to boot. - In the meanwhile, I’ll start organizing my code for the final evaluation.
July 22:
Summary:
- College starts tomorrow. I won’t have as much time on my hands from now on. Still, I’ll try my best to finish as much work as I can before the official GSoC deadline.
July 19-20:
Commits:
- Moved enum xen_load_type inside struct load_op
- Multiple code style fixes.
- Added datailed comments explaining the values defined in busdma_xen.h
- Made bus_dma_xen_impl static
- Whitespace fix in xen_get_dma_tag()
Summary:
- More testing. Tried multiple transfers to multiple disks at the same time. Works.
- Review with Roger. He started reviewing the busdma implementation from the start once more. Check it here
- Fixed multiple code style issues Roger suggested.
- Wrote multiple lengthy explanations about some parts of the code Roger was asking about.
July 18
Commits:
- Added an assertion in xen_load_helper to make sure the error handling
- Fixed flags not being passed to tag_create() in xbd_connect()
- Fixed passing wrong address to memcpy() in xbd_queue_cb()
Summary:
- Before trying to run on a FreeBSD Dom0, I decided to attempt to narrow down which part is failing. So I started systematically.
- First, I tried to determine whether or not we were even loading the memory properly. So, I reverted the blkfront driver to the original, and removed the code in the busdma implementation that handled the grant refs, and left only the code that handled the loading.
- Compiled and ran. No errors. So the loads are happening just fine.
- Then either the allocation/mapping of grant refs is going wrong, or the blkfront code is going wrong.
- Tracing of every loaded map would not work. Most of the maps succeed, and there would be too many prints making it difficult to read the outputs, and the system would be slowed down considerably. To trace the code paths of only the maps that fail would require a per-map logging system. I could implement a simple per-map logging system, but I want to leave it as a last resort.
- So, I next checked if the update to blkfront was wrong somewhere.
- We have touched two areas:
xbd_mksegarray()/xbd_queue_cb()
andxbd_connect()
. - I reset
xbd_mksegarray()
to how it was before (this would mean double the number of grant refs would get allocated because the busdma implementation allocates its own, and thenxbd_queue_cb()
allocates its own. But let’s leave that for a while). The errors are still showing. - So, either the mapping of indirection refs is going wrong somewhere, or the busdma implementation is not properly allocating grant refs.
- I reset
xbd_connect()
to the original and keptxbd_mksegarray()/xbd_queue_cb()
to how it was post-update. - The errors disappear.
- So, the culprit is somewhere in
xbd_connect()
. - I finally have my eureka moment. I tried resetting
xbd_connect()
to the original before (I was not going for this systematic approach then), and the errors were showing up then. But this time they didn’t. It caught my attention that I forgot to resetblock.h
last time. So, I actually forgot to change the type ofcm_indirectionrefs
, which was an array in the original, and a pointer after the update. - Ah! There is the bug. When
xbd_queue_cb()
uses the indirection refs, it does amemcpy()
to copy the grant refs of the indirection pages to the ring’sindirect_grefs
. There it calls:memcpy(ring_req->indirect_grefs, &cm->cm_indirectionrefs, sizeof(grant_ref_t) * sc->xbd_max_request_indirectpages);
- Notice the ampersand with
cm->cm_indirectionrefs
. It was needed whencm_indirectionrefs
was an array. But now it is a pointer. Adding the ampersand would result in the wrong memory being copied. - Changing it to this fixes the errors.
memcpy(ring_req->indirect_grefs, cm->cm_indirectionrefs, sizeof(grant_ref_t) * sc->xbd_max_request_indirectpages);
- A simple ampersand took me 4 days of debugging.
- Tested the newfs command, like Roger suggested. Works.
- Tested with low memory (512M) and copied large files, works fine.
- I will start looking at the netfront driver tomorrow.
July 16 - 17
Commits:
Summary:
- Lots of debugging.
- The kernel boots, but it prints an error message every second or so.
- Around 35% reads/writes have failed by the time the kernel boots and we log in.
- The error is originating from an
EIO
returned by the ring request that we send to the backend. - Still haven’t figured out where or what is going wrong.
- Asked Roger, he has no clue either.
- Guess its trial and error from here on.
- Looked at the backend code. There are 10 places where
BLKIF_RSP_ERROR
is returned. Defined the macroXBB_DEBUG
, soDPRINTF()
would print the error messages. - Weirdly enough, no error messages are printed by the backend.
- Ah, realized my mistake. I am running on a Linux dom0, so making changes to the FreeBSD blkback won’t do any good.
- I really don’t want to build the Linux kernel and hack it. I’ll try to run the domU on a FreeBSD dom0 and see if I can figure something out.