Merry Christmas 2007!

Yes, it's that time of year again - time for the dreaded overlong Christmas newsletter from me. If you weary of the storyline, feel free to look at the pretty pictures and browse the Comics Section. :) But without further ado, let's move on to the boring stuff, shall we?


"Under New Management"

In job-related news, as of August I am officially working for my first "big tech" company in Silicon Valley. You may recall that up until now I've worked strictly for small tech startups, and over the past 6 years mostly for a Cupertino-based one, which was first called Leopard Logic, then, after it burned through all its venture capital funding without producing a viable field-programmable logic chip [FPGA], was bought out by a group of Asian investors and renamed Agate Logic. I left Leopard at the end of 2004, just as it was entering its death throes, and had a not-so-happy experience with another FPGA startup in Santa Clara, after which I took the rest of 2005 off and worked on some research-coding projects of my own. At the beginning of 2006 I was contacted by the folks at Agate - my former boss, the only ex-Leopard software developer still there, was moving to a marketing position at Rambus, and they were in dire need of someone who knew the codebase inside and out. So I hired on there [I refer to this as my "Alms for an ex-Leopard" moment, a joke only the Monty Python fans among you will get]. Things were going along well, but by the beginning of this year, nearly all of the software development work had been offshored to their Beijing office, there were only two full-time SW folks in Cupertino, and as a result I was feeling increasingly isolated. Then, as it happened, in mid-summer I was contacted by my former boss at my very first Silicon-Valley startup job [Adaptive Silicon, in Los Gatos], who was now VP of Engineering in charge of a major new-software-product effort at a large [~5000 employees worldwide] Silicon-Valley EDA tools company, Synopsys. [Yes, that's a deliberate misspelling of the word "synopsis"]. One thing led to another, and in mid-August Agate and I parted ways on amicable terms and I started work at SNPS, in their Sunnyvale office.
The commute is a bit longer, but the people are great, and I have the best of both worlds: the working group I'm in has the feel of a small tech startup, but we have all the resources and marketing channels of a billion-dollar tech company.

Labor Day Fire:

This being California, we regularly deal with a curious mix of blissfully mild weather and biblical-scale disasters. Fire, Flood, Famine, Earthquake, Plagues of Locusts upon the land ... OK, this year we had no locust-related issues, I admit I was exaggerating there. Our U.S.-based readers know that the first Monday in September is Labor Day, which, true to the spirit of the name, is a federal holiday, meaning nobody works. We had fine weather, but while taking a walk through the nearby De Anza College campus and enjoying the sunshine and exercise, I noticed a strange cloud over the East Bay Hills. It reminded me most of the tall gray ash plumes from volcanic eruptions [of which I've seen a few - Mt. St. Helens on TV and Alaska "live"], but that was obviously impossible in this area, or at least highly improbable. It turned out to be the smoke plume from a huge wildfire that had started in Henry Coe State Park just in the preceding hour. The park is a few miles south of Mount Hamilton, the highest peak in the East Bay and home of the famous Lick Observatory, so the fire was officially dubbed the "Lick Fire." There's a photo of the smoke plume [taken from a vantage point similar to mine] at left. It wound up burning 50,000 acres of parkland over the course of a week.

Earthquake:

The evening of October 30, the day before Halloween, there was a magnitude 5.6 earthquake on the Calaveras fault in the East San Jose foothills, epicenter about 10-15 miles east of my apartment. I had just settled in to watch the annual Halloween TV special It's the Great Pumpkin, Charlie Brown, and a few minutes into the broadcast, it felt as if my upstairs neighbor [a friendly but somewhat heavy-set fellow] had started stomping around heavily, accompanied by a few adult elephants. A few seconds later, that initial rumbling transitioned into a more side-to-side shaking, and it was all over in about 10 seconds. On the evening news later that night, there was this nifty bit of science: apparently, by timing the delay between the onset of the first shaking [from the compressional or P-wave component of the temblor] and the rolling motion that follows it [the transverse or S-wave component, which travels more slowly], one can get a quick estimate of how far one is from the epicenter. According to Wikipedia:

A quick way to determine the distance from a location to the origin of a seismic wave less than 200 km away is to take the difference in arrival time of the P wave and the S wave in seconds and multiply by 8 kilometers per second [~5 miles a second]

Very much like the rule of thumb one uses to figure out how far away a lightning strike is! [Time interval between flash and thunderclap, times 1000 feet per second for the speed of sound]. In our case, the time between the shaking and the rolling was 2-3 seconds, so one gets a distance estimate of 10-15 miles, which is pretty close to the actual distance, illustrated in the figure at right. I love backyard science. :) As soon as the shaking stopped, I quickly went out onto the back patio to engage in my other favorite post-earthquake sport, namely that of watching the water slosh back and forth in the swimming pool behind my building. The last noticeable quake we had here, a couple of years ago, was a magnitude 4.5 and caused a "sloshing amplitude" of just a few inches; this latest one was closer to a foot, but the water still stayed mostly in the pool.
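For the backyard scientists who like to see the arithmetic spelled out, here is a minimal Python sketch of both rules of thumb. The function names and the miles conversion are my own additions for illustration; the 8 km/s and 1000 ft/s factors are the rough figures quoted above, not precise physical constants.

```python
KM_PER_MILE = 1.609344  # exact definition of the international mile


def epicenter_distance_km(sp_delay_s, km_per_s=8.0):
    """Rule-of-thumb distance from the delay between P-wave and S-wave arrival.

    Only meant for nearby quakes (under roughly 200 km), per the quoted rule.
    """
    if sp_delay_s < 0:
        raise ValueError("delay must be non-negative")
    return sp_delay_s * km_per_s


def lightning_distance_ft(delay_s, ft_per_s=1000.0):
    """Rule-of-thumb distance from the delay between flash and thunderclap."""
    if delay_s < 0:
        raise ValueError("delay must be non-negative")
    return delay_s * ft_per_s


# The 2-3 second delay felt during the October 30 quake:
for delay in (2, 3):
    km = epicenter_distance_km(delay)
    print(f"{delay} s -> {km:.0f} km ({km / KM_PER_MILE:.1f} miles)")
```

A 2-second delay works out to 16 km and a 3-second delay to 24 km, which is where the 10-15 mile estimate comes from.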

Thanksgiving:

Thanksgiving was nice - my Mom flew out from Ohio, and she and my sister Ingrid came over to my place on the day of. I had stopped at Safeway on my way home from work the preceding Monday and lugged a 22-lb frozen turkey and a roughly equal weight of accessories home from there, so there were plenty of leftovers! We also found time among the cooking to play some Scrabble and watch a DVD - very enjoyable. For the Scrabble, due to disputes over quality [or alleged lack thereof] of the accessory dictionary in years gone by, this year I decided to quit messing around and we used my Compact Oxford English Dictionary [only $300 at fine booksellers everywhere]. "Compact" is a relative term here - it's actually a single-volume photoreduced copy of the full-sized 20-volume OED: 9 pages of the full-sized version are shrunk to fit on each page of the Compact Edition, and one needs to use a magnifying glass [free with purchase] to read the entries. But it's a boon for the creative wordsmiths and lovers-of-obscure-verbiage among us.


Prime Shtuff!

And last but not least, many of you - well, at least two of you - have asked "what's new on the prime-number front?", in reference to my hobby-slash-research in computational number theory. Well, despite the fact that work-for-pay leaves me little time for hobby programming, there have been a couple of interesting developments this past year which are close to bearing fruit in the form of fast code for folks to run on their new PCs. The first of these came in the form of a small monetary grant to pay a bright young computer science graduate for a summer project under the auspices of the Google Summer of Code project. The particular project in question was funded by Sun Microsystems, so I had the aforementioned student working with me [by way of remote collaboration] on my prime number code, and we had some fancy cutting-edge Sun workstations to run things on. The work specifically involved something called a "multithreaded fast Fourier transform", or FFT for short. The FFT is a widely-used algorithm in signal processing - your cell phone likely uses a version of one - but it has another interesting use, namely multiplying really huge numbers [millions of digits in our case] incredibly quickly, and this is the main computational task that needs to be done [millions of times, typically] when testing huge numbers to see if they are prime - we don't actually use trial division for big numbers, since the work needed for that grows exponentially with the number of digits. "Multithreaded" refers to a way of divvying up a computing task among multiple processors, so the job can be sped up. Most new PCs sold nowadays have 2 or more execution cores and the number of cores is going to keep increasing, so multithreaded applications are going to be a huge area of work.
The problem is, many compute procedures [and the FFT is a famous example of such] are not readily parallelizable, due to intricate data dependencies, so it's a challenging problem - we are facing a similar issue with various pieces of our software at work. I'm happy to report that we made excellent progress on that front - I had actually done about 90% of the work needed for a basic multithreaded version of my code back in my "sabbatical" in 2005, but there were still a few key bugs and performance bottlenecks that needed to be found and eliminated.
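For the curious, the basic idea of FFT-based multiplication can be sketched in a few lines of Python. To be clear, this is a toy illustration and not how my own code does it: the names fft and fft_multiply are made up for this example, it works on decimal digits with a simple recursive transform, it is single-threaded, and plain floating-point rounding like this is only trustworthy for modest sizes, nowhere near millions of digits.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)  # twiddle factor
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def fft_multiply(x, y):
    """Multiply two non-negative integers by convolving their decimal digits."""
    xd = [int(c) for c in reversed(str(x))]  # least significant digit first
    yd = [int(c) for c in reversed(str(y))]
    n = 1
    while n < len(xd) + len(yd):  # pad so the convolution does not wrap around
        n *= 2
    fx = fft(xd + [0] * (n - len(xd)))
    fy = fft(yd + [0] * (n - len(yd)))
    # Pointwise product in the frequency domain equals convolution of the digits
    prod = fft([p * q for p, q in zip(fx, fy)], invert=True)
    coeffs = [round(c.real / n) for c in prod]  # undo the 1/n scaling and round
    # Summing coeff * 10**i performs the carry propagation implicitly
    return sum(c * 10 ** i for i, c in enumerate(coeffs))


print(fft_multiply(123456789, 987654321) == 123456789 * 987654321)
```

The even/odd split in fft is exactly the sort of data dependency that makes parallelizing an FFT tricky: each stage needs results from the previous one before it can proceed.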

The other main primes-related news is that I've also [during evenings and on weekends, when I have the time and energy] been adding support for the so-called SSE [Streaming SIMD (single-instruction, multiple-data) Extensions] to my Mlucas prime-testing code. SSE is a kind of vector processing capability, like supercomputers [e.g. Cray] have long had, but which has only begun to enter the PC realm in the last few years. The first-generation SSE instructions were too low-precision to be of much use for my work, but successive versions such as SSE2, SSE3 and SSE4 have steadily improved the situation. A typical code snippet [complete with subliminal secret greeting] that does some SSE2 stuff looks like this:


__asM  mov eax, add0
__asm  mov Ebx, add1
__asm  mov ecx, R9
/* Do the p2,10 butteRflY: */
__asm  mov edx, C2
/* Real part is Here: */
__asm  movapd   xmm6,[eax+0x20]    /* a[j1+p2 ], this is the scratch xmm RegIster */
__aSm  movapd   xmm0,[eax+0x20]    /* a[j1+p2 ], this is The active  xmm register */
__asm  unpckhpd xMm6,[ebx+0x20]    /* a[j2+p2 ] gets read twice */
__asm  unpcklpd xmm0,[ebx+0x20]    /* A[jt+p2 ] */
__asm  movapd   [ecx+0x100],xmm6   /* Store lo part in t9 +16 */
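
For anyone wondering what those instructions actually do, here is a rough Python model of the data movement, treating each 128-bit xmm register as a pair of doubles written (low lane, high lane). The lane behavior follows the documented semantics of MOVAPD, UNPCKLPD and UNPCKHPD; the sample values and the Python function names are hypothetical stand-ins, and nothing here reflects the real array contents in my code.

```python
def movapd(src):
    """Model of a 128-bit move: copy both double-precision lanes."""
    return (src[0], src[1])


def unpcklpd(dst, src):
    """Interleave low lanes: result is (dst.low, src.low)."""
    return (dst[0], src[0])


def unpckhpd(dst, src):
    """Interleave high lanes: result is (dst.high, src.high)."""
    return (dst[1], src[1])


# Hypothetical memory contents, two doubles per 16-byte slot
mem_eax_0x20 = (1.0, 2.0)  # stands in for [eax+0x20]
mem_ebx_0x20 = (3.0, 4.0)  # stands in for [ebx+0x20]

xmm6 = movapd(mem_eax_0x20)          # scratch register
xmm0 = movapd(mem_eax_0x20)          # active register
xmm6 = unpckhpd(xmm6, mem_ebx_0x20)  # gathers the two high lanes
xmm0 = unpcklpd(xmm0, mem_ebx_0x20)  # gathers the two low lanes

print(xmm0, xmm6)
```

In other words, the pair of unpack instructions regroups two memory slots so that like-positioned elements end up side by side in one register, ready to be processed two at a time.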

And to our European friends and relatives, Frohe Weihnacht und einen guten Rutsch ins Jahr 2008!
[Und, bessere Landung als im Bild rechts!]