Sunday, August 31, 2008

Unbreaking CVSup/amd64

When I recently upgraded my FreeBSD6/amd64 ``-STABLE'' machine, the CVSup binary on the system stopped working. CVSup would dump core consistently, shortly after connecting to the remote server. This rather unwelcome development took away my ability to keep my CVS trees up to date, so fixing the bug became top priority. The bug also turned out to be an interesting one.

The bug

Running CVSup after the upgrade to 6.3-PRERELEASE would result in a core dump shortly after connection establishment.

Program received signal SIGBUS, Bus error.
0x0000000800682d4f in fcntl () from /lib/libc.so.6
(gdb)
(gdb) disassemble fcntl
... snip ...
0x0000000800682d3f <fcntl+79>:  movaps %xmm4,0xffffffffffffffc1(%rax)
0x0000000800682d43 <fcntl+83>:  movaps %xmm3,0xffffffffffffffb1(%rax)
0x0000000800682d47 <fcntl+87>:  movaps %xmm2,0xffffffffffffffa1(%rax)
0x0000000800682d4b <fcntl+91>:  movaps %xmm1,0xffffffffffffff91(%rax)
0x0000000800682d4f <fcntl+95>:  movaps %xmm0,0xffffffffffffff81(%rax)
0x0000000800682d53 <fcntl+99>:  lea    0x110(%rsp),%rax
0x0000000800682d5b <fcntl+107>: movl   $0x10,0x20(%rsp)
0x0000000800682d63 <fcntl+115>: movl   $0x30,0x24(%rsp)
0x0000000800682d6b <fcntl+123>: mov    %rax,0x28(%rsp)
... snip ...

The faulting instruction was trying to save SSE registers to memory. This was odd, since there was no reason for this particular code path to be using SSE registers in the first place.

Rebuilding Modula-3 and CVSup from source did not fix the core dump, though the builds of these tools themselves completed without error. A search through the PR database revealed that other FreeBSD users had also been tripped up by the bug: PR bin/124353.

A peek at the solution

Modula-3's runtime needed to be patched in the following way to fix this fault.

  • First, in $M3SRC/libs/m3core/src/unix/freebsd-4.amd64/Unix.i3, we declare the Modula-3 function Unix.fcntl() as being implemented externally by C function ufcntl().
    ... snip ...
    <*EXTERNAL "ufcntl"*> PROCEDURE fcntl (fd, request: int; arg: long): int;
    ... snip ...
    
  • Matching this declaration, an implementation of ufcntl() was provided in $M3SRC/libs/m3core/src/runtime/FBSD_AMD64/RTHeapDepC.c:
    ...
    #include <fcntl.h>
    ...
    int
    ufcntl(int fd, int cmd, long arg)
    {
           return (fcntl(fd, cmd, arg));
    }
        

On the surface, this "fix" does not seem to be doing anything. The ufcntl() entry point takes 3 arguments but it passes these down to fcntl() unchanged, and in the same order.

Yet, despite the apparent ``no op''-like nature of the change, the core dumps were gone.

Why this works

To understand why this fix works, we have to delve into the ABI, specifically the C calling conventions used for AMD64 code.

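For normal function calls, the AMD64 calling convention passes up to six integer arguments in registers. Thus register %rdi would hold the first argument (fd in our case), register %rsi the second (cmd), register %rdx the third, and so on. However, the C prototype for fcntl() is int fcntl(int fd, int cmd, ...);, i.e., fcntl is a varargs function. Varargs functions use a different calling convention on the AMD64: register %rax is a ``hidden'' input parameter for these functions. Its low byte, %al, tells the callee how many vector (SSE) registers carry arguments, and the callee's prologue uses that value when saving the %xmm registers to memory. This is why the faulting code was touching SSE registers at all.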

So, prior to the fix, the Modula-3 runtime was invoking fcntl() directly, but with registers set up for a non-varargs function call.
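The difference between the two kinds of callee can be seen in a small C sketch. The function names here are hypothetical and exist only for illustration. Both functions return their third argument, but only the varargs one has to fetch it through va_arg(), which on AMD64 means its prologue spills the argument registers to a save area first and consults %al while doing so.

```c
#include <stdarg.h>

/*
 * Hypothetical fixed-arity function: the third argument arrives in
 * %rdx and is used directly. The callee never looks at %rax.
 */
long
third_arg_fixed(int fd, int cmd, long arg)
{
	(void)fd;
	(void)cmd;
	return (arg);
}

/*
 * Hypothetical varargs function with the same shape as fcntl().
 * The third argument must be fetched with va_arg(); on AMD64 the
 * compiler-generated prologue saves the argument registers so that
 * va_arg() can find them, and it relies on the caller having set %al.
 */
long
third_arg_varargs(int fd, int cmd, ...)
{
	va_list ap;
	long arg;

	(void)fd;
	va_start(ap, cmd);
	arg = va_arg(ap, long);
	va_end(ap);
	return (arg);
}
```

When both are called from C, as in third_arg_fixed(3, 4, 42L) and third_arg_varargs(3, 4, 42L), the compiler sets up the registers correctly for each and both return 42. A caller that is not a C compiler and assumes the fixed-arity convention gives no such guarantee for the second function.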

Now, as it turns out, in FreeBSD 6.2 and earlier, fcntl() in libc was not a C language function; rather, it was implemented as an assembly language stub that invoked the SYS_fcntl system call. On the AMD64, FreeBSD's argument passing convention for system calls is close enough to the non-varargs C calling convention that the processor's registers happened to be correctly set up for a direct system call.

When fcntl() in libc was changed in FreeBSD 6-STABLE on 24 Apr 2008 to be a C function instead of a system call stub, things broke.

Though not obvious from just looking at the C code, the no-op like fix above works by using the C compiler to translate between the two calling conventions.
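As a rough sketch of that translation, the code below calls a wrapper of the same shape as ufcntl() through a fixed-arity function pointer, which is roughly how a non-C runtime sees it. The typedef and the two helper functions are hypothetical names introduced for this example; only the wrapper mirrors the fix above. The helper uses F_GETFD, which ignores the third argument, to keep the demonstration simple.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical type: the fixed-arity shape a non-C caller assumes. */
typedef int (*fixed3_fn)(int, int, long);

/*
 * Same shape as the wrapper in the fix. Because the C compiler sees
 * the varargs prototype of fcntl(), it emits the varargs call
 * sequence (including setting %al) for the inner call.
 */
static int
ufcntl(int fd, int cmd, long arg)
{
	return (fcntl(fd, cmd, arg));
}

/* Hypothetical helper: read descriptor flags via a fixed-arity pointer. */
static int
get_fd_flags_via(fixed3_fn fn, int fd)
{
	return (fn(fd, F_GETFD, 0L));
}

/* Hypothetical demo: returns 0 if the call through the wrapper works. */
static int
demo_wrapper(void)
{
	int fds[2], flags;

	if (pipe(fds) != 0)
		return (-1);
	flags = get_fd_flags_via(ufcntl, fds[0]);
	close(fds[0]);
	close(fds[1]);
	return (flags >= 0 ? 0 : -1);
}
```

Passing ufcntl to get_fd_flags_via() is well defined because the pointer type matches the wrapper's real signature. Casting fcntl itself to fixed3_fn and calling it would recreate the mismatch described above.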

What's worrying

The relevant change to libc was in CVS/SVN HEAD for about 20 days before it was merged to -stable. CVSup is also a critical tool for the FreeBSD project. This bug was however only detected in -stable, and not in -current.

Monday, August 18, 2008

Getting online using FreeBSD and BSNL DataOne

BSNL's DataOne service (DSL) is straightforward to use in FreeBSD. The following recipe shows how to get online using PPPoE.

  • First, you need to configure your DSL modem as specified by the ISP. This procedure is modem-specific and your BSNL representative should be able to assist you here.
  • You will need the name of the ethernet interface to which your modem is attached. If you are unsure of what this is, use ifconfig(8) to find out.
  • Next, you need to add the following template text to /etc/ppp/ppp.conf:
    dataone:
     set device "PPPoE:*INTERFACE*"
     set authname "*YOUR-USERNAME*"
     set authkey "*YOUR-PASSWORD*"
     set dial
     enable dns
     add default HISADDR
    

    Replace *YOUR-USERNAME* and *YOUR-PASSWORD* with your DataOne user name and password respectively. Replace *INTERFACE* with the name of your network interface (e.g., "rl0" or "fxp0").

  • Finally, invoke ppp(8) in the usual way:
    % ppp dataone
    ppp> dial
    ppp> ... the prompt changes as PPP negotiation proceeds ...
    PPP>
    

That's it! You should be online.

Saturday, August 16, 2008

Draft #5 of the indic-computing "Design Axes" article

Draft #5 of the "Design Axes for Indian Language Computing" article has been made available for review at the Indic-Computing web site.

Please do take a look and let me know your comments.