Thursday, January 7, 2016

Meet Alexa, the Amazon Echo - New Features for Home Automation

In early January 2015, I received the Amazon Echo. I'm always curious about new consumer electronics devices, and this looked to be something pretty different.  It purported to be a music player (competing with Sonos and many others, it seemed) but also announced voice recognition, natural language understanding, and interactive question answering capabilities that put it in a different category.

When the device did arrive, I was pleasantly surprised.  In particular, it's a good quality single-channel bookshelf speaker: it can pair with a bluetooth device and act as a speaker for your phone or iPad as well as play music from your personal music collection, Amazon Prime Music, or several other online music catalogs.  But more importantly, it has a wicked-good array of microphones that pick up your voice commands after you say the hotword "Alexa" or "Amazon".  I'd tried lots of microphone solutions to integrate voice into my home automation system, and back in 2010 gave up on open-air speech solutions, settling instead for Skype as a whole-house microphone, which requires speaking into a phone or iPod Touch.  The Echo's microphones, coupled with its voice recognition engine, work beautifully from across the room even with some significant background noise.  Off the shelf, you can ask it about the weather or sports scores, tell it to play music or a specific song, and lots more.  Everything you say shows up in a companion app on your phone that also lets you interact with some of the features via a classical small-screen user interface.

Jump ahead a couple months and Amazon released the Echo SDK, which made it possible to do integrations as extensions to the grammar Amazon provided.  Even the earliest versions of the SDK were sufficiently solid that I was able to code an integration into Homeseer/Rover in a short afternoon, so now we can say, e.g., "Alexa, ask house to turn fireplace lights on" and exactly that happens.  It understands various devices, events, and scenes given the set of sample utterances I auto-generate from the metadata of my home control software.  Sidenote: I use AWS Lambda as the execution platform for the code -- I love not needing an always-running server to handle these computationally simple and infrequently executed actions.
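To give a flavor of how little code such a Lambda-backed skill needs, here's a minimal sketch of a handler for a "turn device on/off" intent.  The intent and slot names ("ControlDevice", "Device", "Action") are hypothetical stand-ins, not my actual skill's schema, and the actual call into the home-control system is elided:

```python
# Minimal sketch of an Alexa-skill-style Lambda handler for home control.
# Intent/slot names are made up for illustration, not Amazon's actual schema.

def build_response(speech):
    """Wrap speech text in the response envelope a skill returns to Alexa."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }

def lambda_handler(event, context=None):
    """Dispatch a voice intent like 'turn fireplace lights on'."""
    intent = event["request"]["intent"]
    device = intent["slots"]["Device"]["value"]
    action = intent["slots"]["Action"]["value"]
    # Here the real code would call the home-control system's API
    # (HomeSeer, in my case) to actually flip the named device.
    return build_response(f"Turning {device} {action}")
```

Because Lambda only bills per invocation, a handler like this costs essentially nothing to keep deployed for a few dozen voice commands a day.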

But lots is missing...

The Amazon Echo is already a compelling gadget, and because it's off to a great start, I realized that it could be an important component in the whole-house automation system I'm working on for a new home we're building now.   Here's my list of improvements to the Echo's hardware and software that would let its successor serve among the underpinnings of any smart home.

Whole House Audio Support

The Echo could take a page from Sonos's playbook and enable two Echos to work as a stereo pair, or perhaps pair with a speaker-only companion that can be the second speaker in a pair.  For lots of folks, though, what would be better is simply enabling digital-audio output from the single Echo itself.  The on-board mono speaker is great for talking back to the user, but for any at-length music listening experience, you really want to use an amp or receiver with your preferred speakers.  Ideally the digital-audio output would be coax (not optical) since coax digital plays nicer with inexpensive baluns (gadgets on both ends that let you use the copper of a CAT5/6 cable to transmit signals other than ethernet) for a centrally-wired whole house audio system.  The Echo should still turn down the music when it hears the hotword since that's essential to being able to control the music after it's started.

In addition to wired digital audio-out, I'd love to see the Echo pair with other bluetooth devices as a music player (as opposed to as a speaker).  I.e., to support the music being played by the Echo to be transmitted via Bluetooth (ideally with the aptX low-latency codec) to a bluetooth receiver connected to your preferred speaker system.

An alternative to having Echo drive the music itself is to integrate Echo as a controller for SqueezeBox or Sonos music systems.  Those systems are already in place in many houses driving speakers as desired, but don't have good voice integration.  Asking "Alexa, play One by U2" (a tough sentence to parse for sure) should queue up that song on the SqueezePlayer serving the same room as that Echo.

Echo App Improvements

The companion Echo app on the phone/tablet also needs to be improved to be competitive with the other music apps out there.  In particular, it needs to 1) start up in less than 1.5 seconds -- right now, on an LG G4, I wait 10 seconds or more to do anything with the app; 2) integrate with Android Wear for pausing, volume changing, and confirmation of what you just said coupled to Undo functionality; and 3) support multiple Echos, including switching which one you're controlling.

I'd also like to see the Echo app have a display mode where it's reporting about what's happening on the echo in that room.  This mode would be useful for mounted tablets near each Echo for when voice control isn't sufficient or you just want to know what song is playing, pause it, change the volume, or whatever.

Even better would be for the Echo to pair with a tablet as its display partner to support multi-modal interfaces rather than just having the tablet (or phone) reporting its status.  For example, the voice command "Alexa, buy tickets to The Force Awakens" is best served by a continued interaction on a screen rather than reading off a list of possible venues, show times, and viewing options.  If a screen isn't available, the more tedious voice interface could continue, but by using a touch screen in collaboration with voice commands, the interaction becomes much more natural, allowing a click or a response like "The first one looks good" to finalize the purchase.

Multiple Echo Support

One can easily imagine having an Echo in each room of the house, enabling music to follow you around the house (based on Bluetooth beacons or another signal identifying your location).  An important part of this is enabling micro-location awareness of each of the Echos so that they know how they relate to other devices you wish to control.  For example, in my bedroom saying "Ask house to turn the lights on" should have a different effect than in the living room: unqualified device descriptions need to use the context of the room to disambiguate.  (And fully-qualified device names should work anywhere: "Ask house to turn the master bedroom ceiling lights off" should work from anywhere in the house.)
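The disambiguation logic itself is simple once each Echo knows its room.  Here's a tiny sketch of the lookup I have in mind -- the device names and rooms are invented for illustration:

```python
# Sketch: resolve a spoken device name using the room the command came from.
# Fully-qualified names win from anywhere; otherwise the Echo's room decides.
# The device table below is a made-up example, not my actual configuration.

DEVICES = {
    ("master bedroom", "lights"): "master bedroom ceiling lights",
    ("living room", "lights"): "living room lamps",
}

def resolve_device(spoken_name, room):
    # A fully-qualified name works regardless of which Echo heard it...
    for full_name in DEVICES.values():
        if spoken_name == full_name:
            return full_name
    # ...while an unqualified name is scoped to the Echo's own room.
    return DEVICES.get((room, spoken_name))
```

So "lights" resolves differently per room, while "master bedroom ceiling lights" resolves the same everywhere.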

Person Identification

Related to multiple Echos and having per-room context is having per-person context.  Minimally, it should be possible to limit certain functionality to specific speakers (perhaps with an override code word when the voice is a close match but not close enough?).  For example, "Alexa, disarm the security system" should do just that but only if a recognized voice issues the command (and perhaps only if a camera near the entrance also confirms facial identification of a household member).

Voice Notifications

Another useful feature is enabling push voice notifications to an Echo.  If my garage door is open for more than 15 minutes, the tablets in my house running my automation control software announce "Garage door is open!"  Ideally such notifications could be pushed to any/all of the Echos in a house, with each local Echo supporting do-not-disturb functionality to block or delay those notifications in a specific room.
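The fan-out with per-room do-not-disturb is the only interesting part of that feature, and it's small.  A sketch, where the `speak` callback standing in for whatever text-to-speech hook the Echo would expose is entirely hypothetical:

```python
# Sketch: push a spoken notification to every Echo except rooms in
# do-not-disturb mode. The speak(room, message) hook is a hypothetical
# stand-in for a real text-to-speech API.

def push_notification(message, echos, dnd_rooms, speak):
    """Announce message on each Echo not in DND; return rooms reached."""
    delivered = []
    for room in echos:
        if room in dnd_rooms:
            continue  # respect that room's do-not-disturb setting
        speak(room, message)
        delivered.append(room)
    return delivered
```

A fancier version would queue (rather than drop) notifications for DND rooms and replay them when DND lifts, but the room-scoped filtering is the core of it.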

More Hotwords

Having just two hotwords is somewhat limiting (especially since my daughter's name is Alexis -- we can't use Alexa as the hotword, and the word Amazon comes up in our daily conversation too often). It would also be great if some hotwords could tie directly into a skill so instead of saying "Alexa, ask house to turn lights off" one could say "House, turn lights off" where "House" replaces "Alexa, ask house".  Probably having two or three such skill-tied hotwords would make a significant practical difference for high-use skills.  Note also that this approach sidesteps the more complex goal of extending the core grammar with developer-defined utterances -- that seems challenging to do in a scalable way across multiple third-party providers of skills.

Better Control of other Devices

There are already a handful of integrations of the Echo with other device ecosystems (e.g., with SmartThings and with IFTTT), but I'd love to see the Echo natively support various IP-controllable devices like TiVos, HDTVs, etc.  SimpleControl (formerly Roomie Remote) does a fantastic job with a massive library of controllable devices (including support for IR blasters for older components), and Echo should be able to control the same class of gadgets that don't require new antennas, protocols, or additional hardware support.  If they really want to play in the smart home hub space, adding Z-Wave (my favorite right now), Zigbee, Weave, Bluetooth mesh, etc., support would be pretty useful.


I'm hopeful that the next generation Echo will have some set of these features, and certainly others from the clearly insightful, forward-thinking team that came up with v1.  I can't wait!

Monday, December 16, 2013

The next chapter: Prepared Mind Innovations

A little over two months ago, in early October, I finished my last day working at Facebook. Although that last Friday working lasted less than 10 hours, it had been almost a year in planning to ensure a smooth transition for the teams I worked with, and I'm grateful to the new leaders who didn't miss a beat as we shifted responsibilities around.  I wanted to stop working full time in part to be really involved as we design and build a new house, but I can't stop working altogether. At Facebook, we have posters in every building asking (among other things):

"What would you do if you weren't afraid?"

Prepared Mind Innovations is my answer to that: an advisory and innovation company for me to share experience, empathy, wisdom, and technical chops from my over 25 years of professional experience. I'm happy to report that I'm working with some great companies already (see the Prepared Mind Innovations page for details).

Also note that, separate from my company, as an individual, I'm doing a good bit more angel investing. I've done over ten investments now, and am still adding to that set of exciting startups.  I'm looking for missions that will have a positive impact on the world, not just make a buck. (But I do believe that it's helpful to make a buck to fund a mission, so I care about that, too!)

Happy holidays!

Monday, January 21, 2013

My first three months with the Tesla Model S

Delivery of my Tesla Model S

(Want $1,000 off your new Tesla?  Use this code which expires October 31, 2015.)

In early 2010, I started looking for a new car to replace my beloved 2006 Infiniti M35x.  I love technology and I love gadgets, and the M in 2006 was well ahead of its time with adaptive cruise control, steerable headlights, lane-departure warning, voice recognition, Bluetooth integration, and more.  I looked for a couple years for a car that I thought would be a worthy successor to the M, but no luck.  The closest candidate I debated was when the M itself got redesigned in 2011 as the M37, but even that seemed like too little of a tech upgrade for the 6 years that had passed since I first drove the M in 2005.

Finally in September 2011, I decided I'd place a bet on the newly announced Model S sedan from Tesla, the Silicon Valley auto maker that previously developed the Roadster.  Because I was anxious to get into a new ride, I opted for the signature series, so I could be among the first 1,000 customers to take delivery, and I put down a biggish deposit to get in line.

Thirteen months later, my Tesla Model S Signature 85 was delivered direct to my house by an energetic customer representative and a rep-in-training shadowing her.  Perhaps one of the greatest features of the car is the initial ownership experience: no haggling with car salespeople, but just configuring it online and intelligent people responding in a timely fashion to my every inquiry, all via email.  Then it shows up at your door, with an in-person instruction manual.... nice!

That was October 17th, and now, three months later, these are my thoughts....

(The firmware version on which most of this is based is 1.15.x.)

Driving the Model S

Okay, so I admit, while my M was fun to drive, the tech features were really what drew me to the Infiniti almost 8 years ago.  The promise of great technology also drew me to the Model S. However, what's most remarkable and actually the biggest surprise to me is...

The Model S is just unbelievably fun to drive!

I can't stress that enough... it's whisper-quiet and the acceleration is smooth and immediate, no matter whether you're starting from a stand-still or already going 40 and wanting to hit 65 in a hurry, it's simply awesome!  From a stoplight, "green" now means seeing the car that was behind you disappear into a tiny spot faster than I could've imagined.

Handling is really good too; the massive and heavy battery pack (total vehicle weight is over 4,500 lbs) sits very low on the car, so I can't detect any body rolling when cornering.   I do miss the AWD of my Infiniti, but I probably will be driving my wife's Infiniti JX up to the mountains, anyway.  Perhaps most importantly, I miss the adaptive/laser/smart cruise control feature that my M had back in 2006.  That's a real step backwards and hopefully something that future Teslas will get in a hurry, though I'm told the car lacks the forward radar or laser sensors, so this won't be a feature added via a software upgrade.

Another feature of my M that I miss on the Model S is the steerable headlights that point "around" corners.  The Model S does have cornering lights on both sides that turn on when you've turned the steering wheel more than a certain amount in that direction, but it's not nearly as useful on on-ramps and off-ramps... it seems to be more intended for making slow-moving turns in urban areas, and it works great for that.

The Physical

Simply put, the car's exterior is beautiful and shapely.  I won't spend a lot of time discussing this because it's really table stakes for a luxury sedan, but Tesla definitely succeeded in making the car attractive!  The interior is very nice -- it's rather minimalist, but I very much like the styling of the glossy Obeche wood; the one negative is that the glossy finish does reflect a bunch of sunlight, but I still prefer it over the matte finish which, to my eye, looks worn and dull.

The Frunk

The frunk (front trunk) is enormous and cool, but it's tedious to close: I was warned by my delivery specialist to use two hands to close it, and instructed on the exact location of where to put each hand to avoid bending the super-light aluminum hood.  The frunk is definitely not going to be your everyday go-to location for storage.  At least not until it has a motor to pull itself shut like the hatchback does. It does have the advantage of being completely closed and concealed (compared to the hatchback's storage area, which is accessible to anyone who gets inside the car).

Interior Controls

There are no interior physical controls for the sunroof nor for the locks.  The lack of controls for the door locks is less of a big deal than I thought it would be since the car has some good smarts about auto-locking as you start driving, auto-unlocking when you shift back to park, and auto-locking again as the key fob gets a dozen or so feet away from the vehicle, so modulo safety concerns (see more on that below) the locking at least isn't a convenience issue.

The lack of dedicated interior control of the sunroof, on the other hand, is a headache.  It often takes three clicks on the touchscreen to open the sunroof, and many moments looking at the screen (instead of the road) to make that happen.  It's tedious, and I find myself using the awesomely-large sunroof less than I did in my M because it's so hard.  (It sounds like the upcoming 4.x firmware updates allow you to control the sunroof using one of the dials on the steering column; that's a pretty unusual location for a physical control for the sunroof, but in California it might be okay.  I'd hate to accidentally open the sunroof in Detroit during a winter snowstorm!)

Safety Concerns

The proximity key fob generally works great on the side doors as well as the hatchback.  It has one quirky failing, though.  If the key fob is near the driver's side, the passenger side door can be unlocked and opened. That seems like a misfeature... it's already hard enough to lock your doors immediately upon getting in the car, and it's even more dangerous that someone could hop in on the passenger side as you approach the car to get in, since the fob makes no distinction between the two sides of the vehicle.

The Model S does, at least, have a physical hazard-lights button and an unusually-placed-on-the-gear-column emergency brake.  That was thoughtful and good!  It also has a standard cluster of controls on the driver's side armrest for the windows and the side-view (electrochromic dimming) mirrors, but no lock controls:

Contrast the above driver's side with how bare the physical controls are for the other seating locations with just a window control (again, no lock controls):

Miscellany on the Interior

The Model S chose to use a non-standard connector for charging, and that's fine when charging at home.  However, I'm one of the fortunate Silicon Valley folks whose employer has free electric vehicle charging at work, and that means I'm using the adapter all the time.  A useful addition would be a permanent and easily accessible spot for the adapter to attach, perhaps a storage location on the inside of the left door.  Right now, I put it in the smallish glovebox, and use velcro to attach my EV charging cards to the door of the glovebox.  I've also written my email address in silver-colored permanent ink on the adapter (the part that is hidden in the vehicle while charging) in case I ever forget to detach the adapter when returning a public charger to its base; the EV community is fantastically supportive, so I have no doubt it'll get returned to me should I ever be careless!

By the way, that smallish glovebox is really the only concealed storage accessible from the driver's seat.  Tesla talks about different configurations of convenience consoles, but I've not heard much more about them since delivery. Right now, I use a velcro-attached zippable squared-off bag to put my phone and other essentials into; otherwise they'd just slide all around in the very open center area.  There are only very short 1.5" walls preventing these objects from sliding under the accelerator (not gas!) pedal and brake; that would be bad while driving!

A bunch of other little nits about the interior include:
  • The rear window really is just a slit -- this is the price you pay for the amazing aerodynamics of this car.
  • The sunshades are really small and don't have lit mirrors.  They really couldn't be much bigger, though, because of the sleek front styling.
  • The heated seats are great -- they warm up fast and it sounds like that's a recommended lower-energy way of warming up (rather than turning on the climate control system that uses fans and has to blow air around).  The heaters should probably turn off automatically if there's no weight detected in the seat, though.  I do miss my ventilated seats from my M, though... cooled seats sounds like an extraneous luxury, but not on hot California summer days!
  • There are rails on which the front seats move forward and backward via nice power controls.  The fronts of the rails -- the part closest to your leg -- are really sharp... my wife scraped some skin off of her leg getting in the first time.  Most cars seem to have some kind of plastic cover on the end cap of these rails, and that'd be nice to add.
  • The hatchback is supposed to have a privacy cover -- at delivery time they told me they were having supply issues on those... three months later, I'm still waiting for that. Last I've heard, I should be getting mine in February 2013 or so.
  • There are no rear-seat ceiling grips... they'd be useful to backseat passengers during cornering demos :-), but also I used to use them on my M for hanging dry-cleaning.
None of these things is terrible; they're all just a little disappointing for the most expensive car I've ever owned, but probably perfectly reasonable at the entry-level price point for the basic model.

Miscellany on the Exterior

The retractable handles are the most talked-about unique feature of the exterior.  They're super cool, and will be even cooler when they come out as you and your keyfob approach the car.  In the 1.15 firmware,  you have to push the handle a bit before it "wakes up" and comes out so you can then open the door.  This is a little tedious at first, but soon becomes second nature.  A surprise issue with the retractable handles is after a car wash or rain: they stay wet for quite a while since they're not exposed to the sun or wind to dry off.

One other really nice and unexpected touch:  the Tesla-supplied power cable has a button on it to trigger the opening of the charge port to plug-in.  That wows friends at least as much as the handles.

The Technology

Drool... the 17" touchscreen.

This is always the first thing my techie friends ask to see in the car.  And it's nice.  It does a good job of not reflecting sunlight while also not being obviously polarized in a way that can conflict with my sunglasses' polarization.  It is multi-touch, and within the web browser and the navigation apps, the usual gestures for zooming in and out and panning around work as you'd anticipate, though they're more sluggish than even a first-generation Apple iPad (but those aren't 17", either :-) ). So far I have not noticed any web apps customizing their behaviour for the Tesla (the Qt-based browser reports "Mozilla/5.0 (X11; U; Linux; C) AppleWebKit/533.3 (KHTML, like Gecko) QtCarBrowser Safari/533.3" as its User Agent). I'll say more about the apps and the user interface of this large touchscreen in a bit.

There's also a smaller non-touch screen directly in front of the steering wheel.  The center of that shows things like the speedometer, the odometer, the cruise-control settings, etc.  On each of the right third and the left third of this smaller display, you can pick one of a set of apps to show.  The left third can also show a Garmin perspective-view navigation display that's loosely sync'd to the map view on the main display.  More about the navigation in a bit.

Left (non-touch) Screen directly in front of driver

The system overall depends on a data connection.  The current version uses a 3G connection, though folks at Tesla I've spoken to have been cagey about whether there's actually a 4G radio in the car or not.  From the number of bars shown onscreen, it seems that they're using AT&T to provide the data package, and it's free, I'm told, for the first year at least. Tesla does report that there is a WiFi radio and they expect to turn on WiFi functionality in a later firmware update.  This will probably be especially nice for folks who don't have good AT&T coverage in their garage.  Ideally, the apps will get smarter about downloading and caching more when on a better connection.  Right now the Slacker Internet radio and the navigation tiles fail when data connectivity is bad, and that is pretty terrible compared to other cars' navigation systems that hold data for the whole country (or at least a region) on a spinning hard drive or DVD.

Speaking of DVDs, it's worth noting that there's no optical disc capability in the Model S whatsoever.  No CDs, no DVDs, no BluRay.  There are two USB ports that are easily accessible in the all-too-open center console, and one 12V power port (into which I plugged a charge-only USB adapter).  No video inputs, either, and it doesn't look like the car supports MirrorLink yet or any other technology to use the 17" screen to show content from your phone or other devices.  Bluetooth or the USB ports let you play music from your phone, including streaming from Pandora or Spotify or whatever your favorite music service is.

The Bluetooth works reasonably well, although the first firmware I had took many minutes to sync all the many hundreds of contacts from my Samsung Galaxy SIII.  It worked lots better with my iPhone 4, and today works fine with my iPhone 5 and passably well with my SGS3.

UI Framework

The basic user interaction framework for the main screen is straightforward.  There are four stacked areas, top to bottom:
  1. A thin status area showing (from left to right) the current temperature, battery state (clicking opens charge controls window), the homelink control (for operating your garage door), the seating presets control, the stylized "T" Tesla logo, the Bluetooth control, data connectivity status, and the time.
  2. A dock-like horizontal row of the applications showing (from left to right) Media, Nav, Energy, Web, Camera, and Phone.  The app or apps that are currently visible onscreen are shown in reverse text (white on a black background).
  3. The largest middle section which shows the active app or two -- each of the apps can be full height (and consume all of this area) or half-height.  In the picture to the right, you see the Nav app in the top, and the Media app in the bottom, each half-height.
  4. A bottom set of fixed controls showing (from left to right) "Controls", driver-side heated seat, driver-side temperature, climate, passenger side temperature, passenger-side heated seat, and volume. 
The primary mechanism for getting around in the UI is using the dock-like horizontal bar second from the top.  Touching any of the apps that isn't currently being displayed makes that app show in the top half of the large middle section of the display (the area described in item 3 above).  There are a couple of reconfiguration controls on those app windows, too:
  • at the bottom left of each app window, there's a small icon that toggles that app between being half-height and full-height (apps that don't support full-height mode don't have this icon -- Energy, e.g., only runs half-height).
  • at the middle of the split screen (when and only when the main display is split) at the far right, there's a small curved arrow icon that swaps the top and the bottom app windows. 
Although these controls are reasonably intuitive, there's a big usability issue lurking in this v1 design: when an app is full-height, there's no visible indication of what the "hidden" half-height app is, even though that hidden state is persistent.  So, for example consider these two scenarios:

A) From the state in the above picture, if you maximize the Nav app to full-height, the Media app won't be visible and only the Nav app will show in reverse text in the dock.  However, when you then touch, say, the Energy app, that app will take over the top half of the screen, and Media will start showing again in the bottom half.  Versus...

B) From that above picture, if you first swapped the Nav to the bottom half (so Media is on the top half) and then clicked maximize on Nav, Nav will be full screen, also with no visible cue about the Media app.  Now, though, when you touch the Energy app, that app still takes over the top half of the screen, but Nav will be showing in the bottom half.

A simple but significant improvement would require two changes:

  1. simply indicate which app, if any, is hidden from view when there is a full screen app -- probably a different rendering of the text of the app itself in the dock.  This would let the driver know what app will come into view if they un-maximize a full-height app; and
  2. when an app is full screen and another app button is touched in the dock, always put that current full-screen app in the bottom half of the screen and put the newly selected app in the top half-height.  A possible variant on this: If the new app can be full-height then it could come up full-height with the prior full-height app being switched to hidden bottom-half-height mode.
Furthermore, there are missed opportunities here.  The dock buttons do absolutely nothing if the app is already one of the two visible apps.  I'd prefer they followed this heuristic:

  • If the user touches an app that's already in the top spot half-height, make it full-height if possible
  • If the user touches an app that's already full-height, toggle it back to half-height in the top spot
  • If the user touches an app that's already in the bottom spot half-height, switch it to the top half-height (this rule plus the first rule mean that double-tapping an app that's half-height bottom will make it full-height)
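Those three rules, together with the "new app takes the top half" behavior already in the car, can be modeled as a tiny state machine over (top app, bottom app, top-is-fullscreen).  This is just my sketch of the proposed behavior, not Tesla's actual code:

```python
# Sketch of the dock-tap heuristic proposed above, as a pure function over
# a state tuple (top_app, bottom_app, top_is_fullscreen).

def tap(state, app):
    top, bottom, full = state
    if app == top:
        # Tapping the top app toggles it between half- and full-height.
        return (top, bottom, not full)
    if app == bottom and not full:
        # Tapping the visible bottom app promotes it to the top spot.
        return (app, top, False)
    # Tapping any other (or hidden) app: it takes the top half, and the
    # prior top app drops to the bottom half -- both half-height.
    return (app, top, False)
```

Note how double-tapping a bottom-half app first promotes it, then maximizes it, exactly as the first and third rules combine to promise.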

Touchability Heatmap

Although the above aren't strictly necessary, they would be valuable additions because of one significant observation about the giant 17" display: it's hard to aim your touches when you're driving.  For one, you don't want to take your eyes off the road, but second, notwithstanding the super smooth ride, as the car jostles around, small targets are just harder to hit than, say, an iPad sitting on your lap with both hands available to interact.  To Tesla's credit, they've placed the smaller touch-targets at the easiest-to-hit parts of the screen.  After a couple months with my Model S, here's a heatmap I threw together of what I think the easiest-to-touch (reddest) parts of the screen are:

The corners are by far the easiest, especially the bottom left (where the "Controls" button is conveniently located -- you'll use this lots).  All four sides are next easiest, with the left and bottom a good bit easier than the top and right.  The bottom is easier because there's a ledge you can rest your hand on while reaching up to calibrate your touch.  Similarly, the swap-windows icon is a little easier to hit in the middle of the right edge because of some of the trim adjacent to the glovebox (misnamed for California :-) ).

Popup Overlay Windows

The final significant framework component is the modal overlay window with optional tabs (sometimes both horizontal and vertical).  For example, a big overlay window takes over nearly the entire screen when you click the bottom left "Control" button:

This large window consumes nearly the entire screen -- only the top thin status bar and the bottom comfort controls bar remain on screen.  The popup overlays have an "X" icon in the top left to dismiss them; however, it's generally easier to just click on the background or, in the case of the Controls popup, to click the easy-to-find-blind Controls button in the bottom left, which toggles the Controls popup overlay window away.

There's lots going on in the Controls window.  In particular, a whole bunch of nifty settings are hidden behind the Settings top tab. That tab selection only changes the top half of the Controls window -- the bottom half thankfully always shows the Doors & Locks and Lights section.  I'd prefer if the sunroof were always visible, too: the Controls window's configuration is persistent between uses of the popup, so sometimes it can take several clicks to get to the sunroof setting which is one that I tend to use very often.  Overall, the Controls window feels strange as a popup overlay -- I think it might be improved as an animated slide-up card -- that might also suggest the easy affordance of swiping it down to dismiss the screen.

Swipe Gestures

As a related quick aside on another area that feels like a missed opportunity: swipe gestures are conspicuously missing from the experience (excepting the Nav app that lets you drag the map around, pinch to zoom, etc.).  For example, the thermostat controls (which can be synchronized between the driver and passenger) are nicely positioned, but require repeated touches for each degree up or down you want to change the temperature.  I think it might be better to figure out how to allow drags to change the temperature in a single motion.  E.g., you could click and drag up to change the temperature up based on how far you move your finger up (ignoring the speed of the movement), and a drag down would reduce the temperature.  Since you can't drag down initially (there's no room under the button), you'd have to swipe up quickly and then drag down slowly to reduce the temperature.  There'd need to be a visual indication of the virtual temperature slider to help as users learn the admittedly weird gesture, but in time it'd be easier to do without having to look at the screen for each of multiple touches.  (And the existing behaviour could stay, too, as it's easier to learn, but harder to do while driving.)
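The core of that gesture is just mapping drag distance to a temperature delta and clamping to the thermostat's range.  A sketch, where the pixels-per-degree scale and the 60-85°F range are invented tuning constants, not anything from the car:

```python
# Sketch: map a vertical drag distance (pixels, positive = up) to a new
# thermostat setting. The scale factor and range are made-up constants.

PIXELS_PER_DEGREE = 20

def drag_to_temperature(start_temp, drag_pixels, lo=60, hi=85):
    """Drag up warms, drag down cools; clamp to the thermostat's range."""
    delta = drag_pixels // PIXELS_PER_DEGREE
    return max(lo, min(hi, start_temp + delta))
```

Keying the delta off distance rather than velocity is what makes the gesture repeatable eyes-free: the same physical motion always produces the same temperature change.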

The Apps

Most of the apps are pretty basic, and I have some thoughts on each of them.

Energy App

This is very bare bones.  I'd love to see vertical lines that are time markers so I can correlate when things happened with how energy usage changed.  It would also be nice to have annotations explaining things like "Climate control turned off" or "0-60 in 4.7 seconds" :-).

Media App

The Media app is still very early.  Before the first firmware update I got (to v1.15) it was almost impossible to use a high-capacity USB stick... it took forever to scan the folders.  Luckily that wasn't too big of a deal because the Internet radio provided by Slacker works well as long as there's good 3G data connectivity (it skips plentifully when there is not).  One of the little details that is surprisingly welcome about the Slacker radio integration is that the current song pauses as you stop the car, and starts back right where you left off when you get back in.  I.e., it works more like your own music collection than FM or XM/Sirius radio.

Another example of a popup overlay window (like the Controls window described above) is the fader and balance control for the Media app.  It looks like this, and offers a nice and intuitive display and control of the acoustical center of the vehicle:

In the top down view, you simply drag the center disc up to make the front speakers louder, and to the left to make the left side louder.  Basically, you drag the disc over the person whose talking you'd like to drown out! :-)

Other big issues with the Media player include:

  • No support for Slacker premium -- I can't search for artists and play them.
  • No de-duping of MP3s from a high-capacity USB stick
  • No ability to navigate a USB stick by folder to find a specific file; if the ID3 tags are mangled or poor, this is essential.  It's also really useful for rips of foreign-language CDs where the ID3 Artist tag isn't easily guessable.


Camera App

One unusual feature of the Camera app is that it can be turned on and stays on at any speed.  It automatically pops up when you put the car in reverse, and if it appeared automatically, it also goes away automatically when you exit reverse.

The big and notable missing feature for Camera is overlay lines showing the trajectory of the car given the current steering wheel position.  That feature has been on lesser vehicles for almost ten years, and should be an easy software update.  Sonar warning would also be nice, but the Model S lacks those physical sensors, so that's not coming as a software update.

Compared to other high-end cars these days, it's also missing side and front view cameras, but those won't be coming via software, either. :-(


Phone App

The primary things missing here are better navigation of contacts and better integration with the phone's favorite-contacts list.  Car-specific favorites would also be nice.  This probably becomes less of a problem with the newer firmware that supports voice-recognition-based dialing.

Web browser

The basics of the Webkit/Qt-based browser work reasonably well, though as mentioned before, it feels a little sluggish compared to even the first-generation iPad.  Unfortunately, the car really needs a valet mode (see more on this below) or some other mechanism for protecting the web browser's form-fill history and cookies; without one, it's useless for logging in to personalized services such as Facebook, Google Calendar, or Gmail -- any site with login credentials saved to the browser presents a big risk.  Very welcome would be some mechanism to, e.g., PIN-lock the cookies so that after each start of the car, the user has to enter a PIN the first time a cookie is sent to a personalized site (or maybe to a list of personalized sites).

Other obvious wins that aren't yet implemented include:
  • Phone numbers on web pages should be clickable to dial using your bluetooth-connected phone
  • Addresses on web pages should be clickable to enable navigation to that location
More interesting would be platform-like integrations to enable experiences such as navigating to a Facebook event that is coming up shortly after you enter the car, text-to-speech to read emails or Facebook posts, and more.


Nav App

If, in hindsight, the Tesla engineers could've made more progress on any app before launch, I bet they'd wish those improvements on the Nav app.  It's not that it's bad; it's just that this is a place where the car could've really shined and still doesn't.  I have no doubt it'll get great, but right now it's a mixed bag.

The first thing you'll notice about the Nav app is that it's really just a Google Maps app that integrates with what appears to be Garmin navigation technology and UI.  Worse, that integration is very loose: the navigation features on the main screen are limited to drawing an overlay path on the main map and showing a small rectangle listing the next couple of driving instructions from the Garmin-powered nav.  Meanwhile, the left side of the left screen (the screen in front of the driver) shows a bird's-eye perspective view of a very different-looking map.  Perhaps worst, even though the main Google map shows traffic information, it lacks incident information (e.g., accident and construction markers), and the traffic information it shows is in no way reflected in the navigation directions or time estimates.  Fail!  (And I mean fail in a big way -- I once spent three hours driving to Half Moon Bay when I could've been redirected around a major traffic backup and saved 1.5 hours.)

The Google Maps integration actually has several significant problems on its own, independent of its lack of integration with the navigation controls.  These generally seem to be limitations of a very simple use of basically the stock (older) Google Maps feature set.  Among the problems are:
  • The map is always in North-Up orientation; there's no option to set the car's current compass heading up and having the map rotate appropriately to keep that orientation. 
  • The map is always in top-down view mode; there's no perspective view option
  • The street names are too small to read, though there's a somewhat wonky "zoom" setting that seems to just scale up the map by 1.5x or 2x to try to alleviate this problem.
  • The tiles don't seem to be cached at all (or certainly not enough).  This is awful: if you're in a low data connectivity area, don't even think about changing the zoom level or panning around.
  • In track-the-current-position mode, the triangle representing the car stays dead-center.  Ideally it'd show at least a bit more information in the direction you're driving (like older videogames do in 3rd-person top-down views).
  • The navigation path highlight makes it really hard to see the color of the traffic colorings underneath; I sometimes find myself turning off the current destination to get rid of that highlighting just so I can see the traffic pattern colors better.
  • Pinch to zoom in/out works really slowly and sometimes doesn't actually change the scale of the display: I hope someone is working on using the vector-graphics-based Maps instead of the tiles, as that's the long-term right fix to this.
Now it's not all bad.  The satellite view at fully-zoomed-in scale is super fun to watch (as a passenger, of course) while driving along.  Also, entering a destination is far easier than on any other car I've ever used: this seems to be integrated directly with a Google search so typing in a destination of "Stanford Hospital" actually works as you'd want it to (and if that's your destination, you'll be glad to save those few moments)!

Another feature idea: I'd love the ability to bookmark position/zoom-level of the map.  I find that for regular commutes, what I most want is to see an overview of the traffic on the bay area roads and highways between my home and work, and I'd like to get to that configuration of the map with just a touch or two.

Firmware Updates

About a week after I received the Model S, I got an over-the-air firmware update.  The car downloads the new software in the background over the 3G network, and then prompts you to schedule a time for the car to swap in the new packages.  The download takes a couple hours, but the installation completed in only about 25-40 minutes for me.  This is without a doubt the most important feature of the car, and it embodies the Tesla philosophy: iterative improvement with customer feedback is the better way to innovate and advance the state of the art.

By the way, the car seems to do minor firmware updates without notice.  For example, I knowingly updated to 1.15.5, and sometime since then I got up to 1.15.14, but never took other action to update further.

Tesla does recommend you roll down a window when doing the update installation, presumably as a precaution to make sure you can open the car door by reaching in the window and using the interior physical handle.  Luckily I have a garage, so this isn't a big deal, but I imagine for some it could be more problematic (especially in the winter).

Another outcome of this continuous-software-improvement philosophy: when I took delivery, at the very start of using the system, I was taught by the super-helpful delivery specialists how to reset both the left screen and the main screen -- they reboot independently.  Significantly, neither of those subsystems that can be user-rebooted seem to contain anything critical to driving the car: I've rebooted each of them while driving, and while it's crazy dark in the interior (no speedometer, no odometer, nothing), the Model S drives just fine.

New Feature Ideas

Here's a partial list of features that I wish the Model S had and that I hope Tesla is working on. Significantly, I'm trying to focus on things that I believe are possible with the current hardware.

Valet Mode

As I mentioned earlier in the web browser description, it's a bad omission that there are no security controls over the login credentials associated with websites.  Minimally, there needs to be a simple valet mode that's triggered by a couple of touches and requires either a custom code or a pre-saved PIN to get back to regular driving mode.  The valet mode should (a joint list with my friend Sean, with whom I brainstormed):
  • provide basic usage details onscreen explaining how to drive the car
  • lock out access to the web browser, prior navigation destinations and phone numbers, and bluetooth pairings to phones
  • lock out access to any USB sticks and any data usage for media, limit the volume of the stereo
  • switch to auto off lights, put the car in regular creep mode, and close the sunroof
  • lock out access to the frunk (and glovebox if possible)
  • optionally lock out access to the homelink controls
  • limit the acceleration, and always switch to the highest suspension setting the current speed allows
If you forget the PIN, you'd need to call customer service to unlock valet mode; questions about recent prior destinations and other things that valet mode locks down could be used to verify your identity, along with other account information.
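The core of the feature is just a PIN-gated toggle around a bundle of lockdowns. Here's an illustrative sketch -- every name is hypothetical, not any real Tesla interface:

```javascript
// Illustrative sketch of a PIN-gated valet mode toggle (all names are
// hypothetical, not a real vehicle API).
function createValetMode() {
  let pin = null;
  let active = false;
  return {
    enable(newPin) {
      pin = newPin;
      active = true;
      // ...here: lock out browser, nav history, frunk; cap acceleration,
      // limit stereo volume, close the sunroof, etc.
    },
    disable(enteredPin) {
      if (enteredPin !== pin) return false; // wrong PIN: stay locked down
      active = false;
      pin = null;
      return true;
    },
    isActive() { return active; },
  };
}
```

A real implementation would also need the customer-service escape hatch described above for forgotten PINs.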

Presets for Windows and Sunroof

Given how awkward the sunroof is to control, one approach Tesla seems to be pursuing is improving the sunroof controls themselves.  Perhaps another, more tantalizing option is a window-presets control in the top thin status bar.  Analogous to the seat-profiles dropdown in the status bar, window presets would let you save the current window and sunroof configuration to a preset slot and then easily return to one of the saved settings.  This would let you, e.g., switch between all-closed and my preferred sunroof-open-all-the-way-and-back-windows-cracked-a-bit setting with just a couple of touches.  That would leapfrog the much-easier physical sunroof controls that other cars have by simplifying what is otherwise a multi-step process.
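The preset mechanism itself is simple -- a named save/recall of the current configuration. A minimal sketch (names and the percentage units are my assumptions):

```javascript
// Sketch of window/sunroof presets: save the current configuration under
// a name, recall it later.  Field names and percent units are assumptions.
function createWindowPresets() {
  const slots = {};
  return {
    save(name, config) { slots[name] = { ...config }; },
    recall(name) { return name in slots ? { ...slots[name] } : null; },
  };
}
```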

Caldav Syncing and Send-to-Car

Ideally the car will sync with my Facebook events and Calendar and make it easy to navigate to any of the locations in the upcoming hours, being smart about how long it takes to get somewhere so that a trip to Tahoe will be called to my attention 6 hours before I'd need to leave, while a trip down the street wouldn't prompt me to start navigation until closer to the appointment time.  I'd also like to be able to send a map address or set of phone numbers to the car to make it easier to navigate to those locations and dial those numbers.
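The "be smart about how long it takes" logic boils down to prompting near the computed departure time rather than near the appointment time. A sketch, where every name and threshold is my assumption (driveMinutes would come from some traffic-aware estimate):

```javascript
// Sketch: decide when to surface a navigation prompt for an upcoming
// event.  The 10-minute buffer and 15-minute prompt threshold are made up.
function minutesUntilLeave(eventStartMs, nowMs, driveMinutes, bufferMinutes = 10) {
  const leaveAtMs = eventStartMs - (driveMinutes + bufferMinutes) * 60000;
  return Math.round((leaveAtMs - nowMs) / 60000);
}

// Prompt near departure time, whether it's a 6-hour drive to Tahoe or a
// 5-minute trip down the street.
function shouldPromptNavigation(eventStartMs, nowMs, driveMinutes) {
  return minutesUntilLeave(eventStartMs, nowMs, driveMinutes) <= 15;
}
```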

Key fobs Associated with Driver Profiles

As far as I can tell, both of my key fobs work identically. I hope the car can distinguish them, so that a future update can associate a key fob with a driver profile that includes a seating configuration, media and phone favorites and presets, Bluetooth preferences, web browser cookies and form-fill auto-completes, etc.

Timed Charging and Power Sleep Mode

In California, we have electric-vehicle residential rate plans with much lower costs after midnight.  Allowing you to configure the car to start drawing power only on a pre-specified schedule would be economically valuable (charging after midnight can be 4x cheaper).  I imagine this is coming, and from what I read, Tesla is wrestling with the super-hard problem of getting the car to go into a power-saving sleep mode.  More precisely, the challenge is getting the operating system and all the hardware to come out of that mode correctly without unintended changes to state.  Right now the car loses about 8 miles of range per day when it's not plugged in, and Tesla's communications assure us owners that they're working on that problem at high priority.
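The scheduling half of the feature is easy to state precisely. A sketch of the window check, remembering that a charge window typically wraps past midnight (hours and function name are my own):

```javascript
// Sketch: should the car draw current at this hour, given an
// owner-configured charge window?  The window may wrap past midnight
// (e.g., start at 23:00, end at 6:00).
function inChargeWindow(hour, startHour, endHour) {
  return startHour <= endHour
    ? hour >= startHour && hour < endHour
    : hour >= startHour || hour < endHour;
}
```

The genuinely hard part, per the paragraph above, isn't this check -- it's waking the car's hardware back up reliably.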

Creep Mode option to Stay-Put

The S has an optional creep mode to approximate an automatic-transmission vehicle's slow creep when your foot is off the brake.  Apparently, some people like this for maneuvering in tight spots.  With creep mode turned off, the car actually feels more like a manual-transmission car; in particular, it will roll backwards on a hill, and I don't like that.  But I also don't like creep mode, since it's unnatural and merely approximates a quirk of gas-powered cars.

I'd like a stay-put mode that would be almost like a "weak creep" mode.  Specifically, it'd apply enough power to not roll backward or forward on a hill.  If you want to go forward, you push the accelerator when in drive; if you want to go backward, you push the accelerator when in reverse.  Otherwise, the car stays in place.  (You'd still have to set the brake or put it in park for extended time on a hill, of course.)  My guess is that this is actually what would be most natural and intuitive to first time drivers, and stay-put mode seems perfectly doable given the sensors that must exist on the car.
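The physics behind stay-put mode is just canceling gravity along the grade. A back-of-the-envelope sketch (all names and numbers are illustrative, not anything from Tesla):

```javascript
// Back-of-the-envelope sketch of stay-put mode: apply just enough wheel
// torque to cancel gravity along the slope when the driver isn't on the
// accelerator.  All values here are illustrative.
const GRAVITY = 9.81; // m/s^2

// Wheel torque (N*m) needed to hold the car still on a grade.
function holdTorque(massKg, slopeRadians, wheelRadiusM) {
  return massKg * GRAVITY * Math.sin(slopeRadians) * wheelRadiusM;
}

function stayPutTorque(accelPedalPressed, massKg, slopeRadians, wheelRadiusM) {
  // Driver input always wins; otherwise hold position.
  return accelPedalPressed ? null : holdTorque(massKg, slopeRadians, wheelRadiusM);
}
```

Since the car already knows its pitch and can measure wheel motion, the sensors for this presumably exist.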

Charging Fail Notice

Many public chargers require telling the base station to start charging before current flows.  It'd be useful if the car honked or chirped at you if it's plugged in but no current is flowing after, say, 20 seconds. That'd make sure you didn't forget the last step needed to start the car charging.
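The check above is a one-line predicate. As a sketch (the 20-second grace period matches the suggestion; everything else is assumed):

```javascript
// Sketch: chirp if plugged in but no current is flowing after a grace
// period.  Names are hypothetical; 20s matches the suggestion above.
function shouldChirp(isPluggedIn, currentAmps, secondsSincePlugin, graceSeconds = 20) {
  return isPluggedIn && currentAmps === 0 && secondsSincePlugin >= graceSeconds;
}
```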

WiFi Access

Not only would WiFi access save on data usage and make downloading music, map tiles, firmware updates, etc., lots faster, it's also really useful for integrating with the rest of a smart home.  I'd love to be able to use devices running on my house's WiFi network without opening up holes in my firewall and risking the security issues associated with that.

Voice Memo app

I'd love a simple voice memo app that emails me or other pre-configured recipients a WAV file of something I just said.   Ideally it'd integrate with calendar, too, so that attendees of upcoming meetings would be easy to send a quick email-voice-recording to.

Access to Screenshots

An undocumented (as far as I can tell) feature of the car is that a ~5+ second press of the menu button on the steering wheel takes and saves a screenshot of both displays.  (I discovered it accidentally when playing with the steering wheel controls in park.) Unfortunately, I don't know where they go (Tesla service?); I'd love to have them emailed to me automatically or provide a way for me to download them later so I could better communicate with the service folks without having to take an indirect photo using the camera on my phone (which is impossible to do safely while driving when something weird happens -- hypothetically, of course :-) ).

iOS and Android Apps

It's no secret Tesla is working on these, but I'm a beta user of the Android App and agreed when I became a beta user not to say anything about them, so I won't.

Oh, and it's Electric

I've not written much about it here (intentionally, because I did not have a goal of buying an electric car), but obviously the Model S gets lots of press about its one central feature of being 100% electric.  I like not visiting gas stations, and in the 1,200 miles I've driven, I've averaged about 350 Wh/mile.  That's about 420 kWh total, each of which costs between $0.04 and $0.21 from PG&E at home (depending on how much other electricity needs I have) -- if I'd charged all of those miles at home, it would've cost me between $15 and $90 total for the 3+ months since October 17, 2012.   You don't have to want to be green for that to make a ton of sense!
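For anyone who wants to redo the arithmetic with their own rates, here it is made explicit:

```javascript
// The cost arithmetic from the paragraph above, made explicit.
function totalKwh(miles, whPerMile) {
  return miles * whPerMile / 1000;
}

function chargingCostDollars(miles, whPerMile, dollarsPerKwh) {
  return totalKwh(miles, whPerMile) * dollarsPerKwh;
}
```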

Postscript: I just got v4.2 of the firmware last night and I'm delighted to see a bunch of changes to the way the dock app buttons work, similar to how I proposed.  The voice command system is also great when it works, but it fails as often as it recognizes what I said.  I'll have an update later.

(Disclaimer: I own some shares of TSLA -- I bought them right after my test drive of the Model S, and expect to hold them for a long while.)

Saturday, June 9, 2012

SSD to Revive my PC

My current primary home Windows PC is running Windows 7 on hardware from late 2009: a Quad Core Intel Core2 CPU, a variety of SATA and IDE hard drives, and 4GB of RAM (800 MHz bus).  It's still very usable after booting, but the time from power-on to having logged in and getting Chrome or Firefox open is 3 minutes and 5 seconds.  Here are a set of timings for the machine (all times are total from the point the BIOS recognizes the CPU):

To login screen visible: 40 sec
To Windows 7 desktop visible: 1min 15sec (75 sec)
To Chrome browser open: 3min 5sec (185 sec)

That's a bunch faster than it ran under Windows XP (I skipped Vista... didn't everyone?), even with the addition of tons of drivers and startup processes that I surely used only once to play with a new gadget.  Still, after experiencing my recently bought Samsung Series 3 Chromebox (see more on that below), I figured I should be able to do better and breathe some more life into this machine.

After doing a bit of research on consumer SSD (Solid State Drives), I found the Crucial 128 GB m4 SSD and thought I'd give that a try.  My prior boot partition was only 80GB, so I didn't need a bigger size (I use secondary drives for my data to avoid overfilling the boot partition and to make it easier to re-install/update OS).  I also ordered a StarTech 18in SATA Cable since it wasn't terribly clear whether the SSD would come with a cable.

A little more research later, I decided I should also use disk-cloning software to transfer the boot drive and OS from the existing SATA drive to my new SSD. There are instructions for how to do this with only free software, but Paragon's Migrate OS to SSD tool seemed like a good deal for $20.

And it was... the cloning of the OS to the new drive went flawlessly (as had the installation of the SSD).  I did struggle for (far) too long with getting my complex mix of drives to boot reliably off the new SSD, and I learned more than I wanted to about BootRec.exe and BCDEdit.exe.  On the same night I started, though, I got it working and was ready for the timing test.  The new numbers with the Crucial 128 GB m4 SSD were:

To login screen visible: 34 sec (down from 40 sec, ok)
To Windows 7 desktop visible: 44 sec (down from 75 sec -- wow!)
To Chrome browser open: 49 sec (down from 185 sec -- wow WOW!)
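Expressed as ratios, the improvement is even clearer (trivial arithmetic on the timings above):

```javascript
// Speedup ratios for the before/after timings above.
function speedup(beforeSec, afterSec) {
  return beforeSec / afterSec;
}
```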

The PC now feels almost like a brand new high-end machine with a fresh OS install!

Again, though, remember that the original point of comparison was my new Samsung Series 3 Chromebox.  If all you ever want to do is get into a Chrome web browser, it's still the champ.  It goes from power-on to typing a URL into the address bar in only 7 seconds.  It took me longer than that to write this short sentence!

Tuesday, May 24, 2011

Cassowary Constraint Solver in JavaScript

As a longtime-recovering academic, I still occasionally feel the urge to make the work I did for my Ph.D. back in the '90s matter more in the real world.  Recently, I had the chance to use two different and unrelated pieces of my research together.... as any fellow technologist knows, that makes doing the work way more than twice as fun!

One of the projects I found a new use for is JavaML – work I did in 1999. For my grad school career, JavaML was perhaps most notable because it was done right before I finished my Ph.D. but was not included in my thesis at all... quite literally, I went to my advisor one week before the submission deadline and said "I'm gonna disappear for a week while I work on this cool idea, ok?" Luckily, that paper was accepted to WWW in 2000, which was hosted in Amsterdam. Yes, the choice of conference was somewhat influenced by its being hosted in the Netherlands, but also because the project idea was grounded in something I knew the WWW community would be excited about: the use of web standards applied to developer, language, and compiler tooling. Specifically, my idea was appealingly simple: convert Java source code into XML so that one could use XML tools such as XSLT, SAX parsers, DOM trees, and more, to write code transformations and analysis and querying tools. The primary challenge was the exact design of the XML schema for the Java AST.  That piece was critical in making downstream tools powerful and easy to build.  As with all of my work, I backed the theory with an implementation that I built using IBM's then state-of-the-art Java compiler. All told, that JavaML paper turned out well: I made a cool poster for UW's industrial affiliates one year, and I had a working implementation with plenty of reasonably compelling examples. And I had a fun visit to Amsterdam... But never did I myself use the tools I built for anything practical. Until a couple weeks ago....

The other prior research I dusted off in my latest personal project is the Cassowary Constraint Solving Toolkit. That work and its applications made up the bulk of my PhD thesis. In a nutshell, Cassowary is a library and framework for expressing linear arithmetic equality and inequality relationships among variables and solving systems of those relationships incrementally and in real time. Most of my work involved applying the incremental solver to graphical applications such as the Scheme Constraints Window Manager or the Cascading Style Sheet engine in web browsers and Scalable Vector Graphics renderers, but other researchers used it for satisfiability engines and resource planning, and more. For many years, I used Cassowary daily in the Scheme Constraints Window Manager, but it never achieved widespread use on the web. In part this is because it needed to be built into the browser... although I made the Cassowary toolkit available in C++, Java, Smalltalk, Python, and Scheme, none of those languages were easy to demonstrate over the web.

A few weeks ago, I realized something both obvious and important. We all know that JavaScript implementations have improved leaps and bounds over the last decade. IE6 was a disaster to work with in large part because the JavaScript implementation was good enough only for the tiny little scripts that defined JavaScript in the 90s. Today, Chrome and its V8 engine along with similar advances in Safari  (Nitro) and Firefox (TraceMonkey) make JavaScript performance scream. That meant that I could run Cassowary natively in the browser with good performance if only I had a JavaScript implementation. (Or, more properly, ECMAScript.)

Thus, my plan was hatched: given that I wrote a Java implementation of Cassowary, I would use JavaML to serialize that code into its XML representation. Back in 2000, I wrote an XSLT-based converter to go back from JavaML's XML representation to plaintext Java source code (in fact, round-tripping between those representations was my unit test as I was building JavaML). For this project, I'd change that converter to, instead of outputting Java source code, transliterate language constructs and generate a close-to-correct JavaScript plaintext representation of the program. Then, after some editing and debugging, I'd end up with a complete JavaScript implementation of the solver enabling me and others to explore the use of constraints for layout or other functionality within rich, modern AJAX web applications. In case you're impatient, here's the bounded quadrilateral demo in JavaScript. The full JavaScript implementation of Cassowary is available via CVS at Sourceforge with the rest of the Cassowary distribution.

Let me go through that last paragraph in slow motion. First, I took my Java implementation of Cassowary which has code like this:

public String toString()
{
      StringBuffer bstr = new StringBuffer("[");
      for (int i = 0; i < _values.length-1; i++) {
        bstr.append(_values[i]);
        bstr.append(",");
      }
      // ...
      return bstr.toString();
}

After running it through my Jikes-JavaML tool, I get JavaML XML code like this:

<method name="toString" visibility="public" id="meth-3340">
  <type name="String"/>
  <block>
    <!-- StringBuffer bstr = new StringBuffer("[") -->
    <local-variable name="bstr" id="locvar-11045">
      <type name="StringBuffer"/>
      <new><type name="StringBuffer"/><arguments>
           <literal-string length="1">[</literal-string></arguments></new>
    </local-variable>
    <loop kind="for">
      <init> <!-- int i = 0 -->
        <local-variable name="i" id="locvar-11050">
            <type name="int" primitive="true"/>
            <literal-number kind="integer" value="0"/></local-variable>
      </init>
      <test> <!-- i < _values.length - 1 -->
        <binary-expr op="lt">
          <var-ref name="i" idref="locvar-11050"/>
          <binary-expr op="-">
            <field-access field="length"><var-ref name="_values"/></field-access>
            <literal-number kind="integer" value="1"/>
          </binary-expr>
        </binary-expr>
      </test>
      <update> <!-- i++ -->
        <unary-expr op="++" post="true">
            <var-ref name="i" idref="locvar-11050"/>
        </unary-expr>
      </update>
      <block>
        <send message="append"> <!-- bstr.append(_values[i]) -->
          <target><var-ref name="bstr"/></target>
          <arguments><array-ref><base>
                <var-ref name="_values"/></base><offset>
                <var-ref name="i"/></offset></array-ref></arguments>
        </send>
        <send message="append"> <!-- bstr.append(",") -->
          <target><var-ref name="bstr"/></target>
          <arguments><literal-string length="1">,</literal-string></arguments>
        </send>
      </block>
    </loop>
    <!-- ... ... -->
    <return> <!-- return bstr.toString() -->
      <send message="toString">
        <target><var-ref name="bstr" idref="locvar-11045"/></target>
      </send>
    </return>
  </block>
</method>

Note that I specifically designed features of the JavaML representation to support edges (via id/idref pairs) between uses of variables and their declarations (which include their static types). That means that I can convert the above XML into JavaScript, essentially transliterating from Java to JavaScript, in a type-aware way using XSLT and template precedence rules to specialize rewrites based on details of the AST. For example, these rules make sure that StringBuffer.append and .toString method invocations result in the appropriate JavaScript constructs of "+=" and a no-op, respectively:

<xsl:template match="send[@message='append' and
                     id(target/var-ref/@idref)/type/@name='StringBuffer']">
  <xsl:apply-templates select="target"/>
  <xsl:text> += </xsl:text>
  <xsl:apply-templates select="arguments/*"/>
</xsl:template>

(The above template matches send elements with a message attribute of 'append' that also have target/var-ref/@idref pointing to a declaration that has a type element with name attribute 'StringBuffer'.  I.e., apply this rule to all sends of the "append" message to variables with static type "StringBuffer".)  The companion rule makes .toString() on a StringBuffer a no-op by emitting only the target:

<xsl:template match="send[@message='toString' and
                     id(target/var-ref/@idref)/type/@name='StringBuffer']">
  <xsl:apply-templates select="target"/>
</xsl:template>

Those rules applied to the JavaML representation of the Java source code then result in this JavaScript:

  toString: function() {
    var bstr = "[";
    for (var i = 0; i < this._values.length-1; i++) {
      bstr += this._values[i];
      bstr += ",";
    }
    // ...
    return bstr;
  },

In this case, the converted code is perfect and ready to execute.  The actual process was a bit more laborious: it involved iterating on the XSLT transformation, specializing it for each subsequently more complex Java source file in the full library, and hand-editing the output where it wasn't worth my while to make the XSLT smarter (and finding a few bugs in Jikes-JavaML and my XSLT transformations along the way).  I also had to choose several libraries and JavaScript language extensions as dependencies for the converted JavaScript.  For a simple object-oriented framework, I used the server-side MooTools, and for a true hashtable and hashset (i.e., ones that can have arbitrary objects as keys, not just strings), I used jshashtable and jshashset (with some minor modifications to support an escapingEach mechanism allowing break and non-local returns, since the algorithm uses those patterns throughout).

After about three weekend days of working on this, I ironed out the last of the (known) bugs (which turned out to be an error in my Java implementation that I'd fixed in the C++ implementation years ago).  Then I implemented the previously-mentioned bounded quadrilateral demo in JavaScript, learned about Touch events on iOS devices and made the demo work on my iPad and iPad2.

I hope developers out there are able to find creative uses of the constraint solver on their web pages and applications.  And maybe someone will improve the JavaScript implementation with something that truly looks and feels like JavaScript (rather than Java hiding behind a JavaScript surface syntax)!  Until then, I may port to JavaScript the string expression parser to make it easier to specify constraints just by writing something like "x + width < screen_width".  The dynamic expressiveness of JavaScript should be a huge help in experimenting with tighter integration with the rest of the environment, similar to my Scheme implementation and the way it interacted with the Scwm Window Manager.
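As a toy sketch of what that expression parser might look like -- purely illustrative, splitting a linear constraint into structured parts; a real version would construct Cassowary expression objects rather than returning strings:

```javascript
// Toy sketch: split a linear constraint expression like
// "x + width < screen_width" into left-hand terms, operator, and
// right-hand terms.  (Hypothetical; not part of the Cassowary API.)
function parseConstraint(expr) {
  const m = expr.match(/^(.+?)(<=|>=|=|<|>)(.+)$/);
  if (!m) throw new Error("not a constraint: " + expr);
  const terms = (s) => s.split("+").map((t) => t.trim());
  return { lhs: terms(m[1]), op: m[2], rhs: terms(m[3]) };
}
```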

Saturday, May 1, 2010

Skype as a whole-house Wireless Microphone for Home Automation

After buying a home (finally!) this past fall, I've been setting up a bunch of home automation capabilities. Recently I got far enough along that I started pondering how I might get a whole-house microphone system talking to my central (Windows XP-based) computer running HomeSeer (v2.0.4.36). I first considered xTag USB-Only System w/ One xTag Wearable Microphone, but the range of just tens of feet seemed way too limiting.

Then I had a thought... a wonderful, crazy thought... what if I could find a WiFi microphone -- one that used my existing 802.11n wireless network. After doing some quick searches, I realized that I already had such a thing: my Apple iPod touch (2nd generation since it needs a microphone on the hardware) and Skype! All I needed to do was use my iPod touch (or iPhone, or anything that has a decent Skype client) to call a Skype account running on my home automation server. I could then configure the sound output from that call to be the voice recognition input into HomeSeer's "Speaker" application that then controls HomeSeer. (Notably, the Speaker application actually need not be running on the same machine as HomeSeer, but I'm trying to be green by having only a single low-wattage computer always-on in my house, so I run it on the same machine.)

Now, if you've not been reading all my oh-so-infrequent posts, you may be thinking "hmmm, that sounds great, but how do you make the audio output from one program [Skype] become the microphone input of another program [Speaker]?"

Avid readers will recall that in an earlier post I did exactly that to stream iTunes music to my SqueezeBox Duet receiver. Using Virtual Audio Cable 4, I created two virtual audio cables. I set up "Virtual Cable 1" as the default output channel for Speaker, as shown in the first figure.

Then I set up "Virtual Cable 2" as the default microphone input -- you do that from Control Panel -> Sound (the exact path varies by Windows version). After making those changes, restart the Speaker client; at that point the Virtual Audio Cable control panel should look like the screenshot below. Notice that there is 1 recording stream on Virtual Cable 2 -- that's the Speaker application, which opened that cable as its microphone. If it doesn't show 1 there, make that cable the default input again and restart the Speaker application.

After that, I set up Skype on the home automation PC: I created a distinct user account, set it to accept calls only from my specific other Skype account, and made it auto-answer. I pointed Skype's microphone input at Virtual Cable 1 (the cable the Speaker application does text-to-speech onto, so the computer's voice gets sent to the speaker of my iPod Touch and I can hear what it says to me) and Skype's speaker output at Virtual Cable 2 (the cable the Speaker application uses for voice recognition, so whatever I say into my iPod Touch's microphone gets played into it). The audio settings end up looking like this:
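The cross-wiring is easy to get backwards, so here's an illustrative sketch (Python, with endpoint names matching the two cables, but otherwise made up) of the invariant: each virtual cable must pair one application's output with the other application's input.

```python
# Sketch of the virtual-cable cross-wiring described above. Virtual Audio
# Cable exposes each cable as an ordinary Windows audio endpoint; each
# application then binds its output and input to one of them.
routing = {
    "Speaker": {"output": "Virtual Cable 1",   # text-to-speech goes out here
                "input":  "Virtual Cable 2"},  # voice recognition listens here
    "Skype":   {"output": "Virtual Cable 2",   # my voice from the call lands here
                "input":  "Virtual Cable 1"},  # TTS audio gets sent to the caller
}

def check_crosswired(routing):
    """Each cable should pair one app's output with the other app's input."""
    a, b = routing["Speaker"], routing["Skype"]
    return a["output"] == b["input"] and a["input"] == b["output"]

print(check_crosswired(routing))  # True when the cables are crossed correctly
```

If that function ever returns False, you've wired an application's output into its own input -- which manifests exactly as the silent failures described above.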

To test it, I made a call from my iPod Touch to that new Skype account. The Virtual Audio Cable control panel comes in handy here again; after the call is connected, it should look like:

Notice that now there is a recording stream on Virtual Cable 1 (Skype reading its microphone input, the output of the Speaker application) and a playback stream on Virtual Cable 2 (Skype writing its speaker out to the input of the Speaker application).

From that point, I'm able to give voice commands to HomeSeer via my iPod Touch. I had to retrain the voice recognizer (Windows Control Panel->Speech) for this new setup, even just to get Speaker to recognize the attention phrase. It'd be nice to not require the attention phrase and to simply take a command immediately upon Skype answering a call, but I'm not an expert with Speaker so I'm not sure it's possible.

I was also able to get this working with my new HTC Droid Incredible Android phone, and again had to retrain voice recognition and microphone levels with that setup. Unfortunately (and surprisingly), Verizon seems to let its Skype Mobile client run only over 3G, not over WiFi... this seems really insane, especially for the substantially more open Android platform -- remember, Skype exists in a more full-featured form on the Apple-controlled iPhone. Regardless, I got VoIP over WiFi working using Fring and its Skype plug-in.

All in all, I'm pretty psyched to now have my whole-house microphone setup! Now if only HSTouch on the iPod Touch worked with the SqueezeBox plug-in without crashing... then I might be able to control my audio using my voice. (I still use the awesome iPeng application to do that, and am waiting for an iPad version of that app.)

Saturday, May 23, 2009

iPhone as a Universal Remote Using SqueezeBox Duet Controller IR

After the grand machinations I described in my last post to enable Apple's Remote iPhone application to control music around my house through the SqueezeBox Duet Receiver, a commenter pointed out the then-new iPeng application. That app lets my iPod Touch directly control SqueezeCenter on my PC to manage my various SqueezeBox Receivers -- exactly what I was pining for. In the ensuing months, I transitioned exclusively to iPeng, which left me with a question: what do I do with my now oh-so-dusty SqueezeBox Duet Controllers?

The answer was easy once I took a closer look at the hardware of those controllers... something I'd not noticed before jumped out at me: an Infrared (IR) transmitter! In its shipped configuration the IR transmitter doesn't do anything -- an intron of a component -- but thanks to Google and various threads like this one on the slimdevices community forums, I discovered that there is rudimentary driver support for the IR device along with some trivial samples of using it. Given that I already knew the controller runs a BusyBox-based embedded Linux, I knew how I'd challenge myself that evening -- no turning on my Sony IR-controlled TV until I could do so from my PC!

I spent the next while reading a bit about Infrared signalling protocols and conventions, and then I built a trivial little script that just called the Duet Controller's /bin/testir via ssh with a command line full of hex codes that I carefully crafted based on my new-found partial competence. I announced to the room "Here we go!" as I dramatically pressed [Enter]......


I rinsed and repeated many times, alternating between worrying that the degrees of freedom in the command specification were too great and celebrating short-lived epiphanies that narrowed the search space dramatically. Finally, I decided to pay more attention to the "repeat" parameter that the Linux Infrared (LIRC) config files mention in passing: some devices only recognize an IR command when it is sent two or more times back-to-back. I saw that my Sony LIRC file used that option, so I doubled the hex codes corresponding to the command part of the line I was trying, and gave a surprisingly hopeful and still dramatic "And then there was..." as I once again lowered my hand onto [Enter]....


The hum of my (awesome, but now six-years-dated) Sony HD CRT monitor warming up was music to my ears! How cool! With that magical ~200-character command line, I knew I could use the SqueezeBox Duet Controller as the IP-controlled IR transmitter in an iPod Touch (or iPhone) infrared universal remote solution!
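For the curious, the Sony (SIRC) protocol behind all that hex is simple once decoded. Here's an illustrative Python sketch of the pulse train -- the timings and LSB-first bit order are the standard 12-bit SIRC conventions, and "repeat" is exactly the doubling trick above; the actual hex format /bin/testir expects is its own, device-specific affair, which I won't reproduce here.

```python
# Sketch of Sony SIRC (12-bit) pulse-train generation, timings in microseconds.
# SIRC sends the 7-bit command LSB-first, then the 5-bit device address.
# A "1" bit is a 1200us mark, a "0" bit a 600us mark, each followed by a
# 600us space; the frame opens with a 2400us header mark.
HEADER_MARK, SPACE = 2400, 600
ONE_MARK, ZERO_MARK = 1200, 600

def sirc_frame(command, address):
    """Return (mark, space) pairs for one 12-bit SIRC frame."""
    pulses = [(HEADER_MARK, SPACE)]
    bits = [(command >> i) & 1 for i in range(7)]   # command bits, LSB first
    bits += [(address >> i) & 1 for i in range(5)]  # then address bits
    for b in bits:
        pulses.append((ONE_MARK if b else ZERO_MARK, SPACE))
    return pulses

def sirc_burst(command, address, repeats=3):
    """Sony receivers want the frame repeated -- the "repeat" option above."""
    return [sirc_frame(command, address) for _ in range(repeats)]

# Sony TV "power" is conventionally command 21 at device address 1:
burst = sirc_burst(21, 1)
print(len(burst), len(burst[0]))  # 3 frames of 13 mark/space pairs each
```

Seeing the command bits laid out like this makes it obvious why a single un-repeated frame left my TV unimpressed.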

The rest of the path was simple compared to the reverse-engineering and experimental IR learnings. I educated myself on Lua and Jive (now known as SqueezeOS), respectively the embedded programming language and the application framework used by the SqueezeBox controllers. Then I set out to code a small Lua applet to run on the controller. I called it IRServer, and the design was simple: an HTTP-like server that accepts simple commands on a TCP port, strips them of their HTTP artifacts (if any), and issues the appropriate os.execute call to run the C program that drives the IR device driver. By making the server understand just enough HTTP, I knew I could even write a static HTML page that issued commands when links or buttons were clicked (e.g., by requesting an Image object at a URL corresponding to the command I wanted IRServer to issue).
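The real applet is Lua running under Jive, but the shape of the design fits in a few lines of illustrative Python. The command table, port number, and testir arguments below are placeholders, not my actual codes:

```python
# Minimal sketch of the IRServer design: accept a one-line command over
# TCP, strip any HTTP request wrapping, and shell out to the IR tool.
import socket
import subprocess

# Map of command names to the invocation that fires the IR blaster.
# Both the name and the testir arguments here are illustrative.
COMMANDS = {"tv_power": ["/bin/testir", "<hex codes>"]}

def parse(line):
    """Strip HTTP artifacts: 'GET /tv_power HTTP/1.1' -> 'tv_power'."""
    parts = line.split()
    raw = parts[1] if len(parts) >= 2 and parts[0] == "GET" else line
    return raw.strip().lstrip("/")

def handle(line):
    name = parse(line)
    if name in COMMANDS:
        subprocess.run(COMMANDS[name])
        return "HTTP/1.0 200 OK\r\n\r\nsent %s\r\n" % name
    return "HTTP/1.0 404 Not Found\r\n\r\nunknown command\r\n"

def serve(port=6789):
    """Loop forever answering one-line commands, HTTP-wrapped or bare."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(1)
    while True:
        conn, _ = srv.accept()
        with conn:
            data = conn.recv(1024).decode("ascii", "replace")
            first = data.splitlines()[0] if data else ""
            conn.sendall(handle(first).encode("ascii"))
```

Because the server tolerates a bare HTTP GET, a plain `<img src="http://controller:6789/tv_power">` in a static page is enough to trigger a command.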

I spent a good part of the following weekend pulling the above little design together, along with a simple Perl script that takes a set of Linux IR config files and turns them into HTML with one link per defined IR button (don't worry, the links on that page won't control my devices when you click -- the page assumes you're on the same LAN as the controller). I even played around with IUI, the iPhone User Interface framework, and started making a much more iPhone-friendly UI for a bunch of the devices in my living room.
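The original script was Perl; here's an illustrative Python sketch of the same idea, assuming the standard LIRC `begin codes ... end codes` layout and a hypothetical controller hostname:

```python
# Sketch: turn a LIRC remote config into an HTML page with one link
# per button. Each link just requests a URL that the IR server on the
# controller maps to a command; the hostname below is hypothetical.
def lirc_buttons(config_text):
    """Pull button names out of the 'begin codes ... end codes' block."""
    buttons, in_codes = [], False
    for line in config_text.splitlines():
        line = line.strip()
        if line == "begin codes":
            in_codes = True
        elif line == "end codes":
            in_codes = False
        elif in_codes and line and not line.startswith("#"):
            buttons.append(line.split()[0])
    return buttons

def buttons_to_html(buttons, server="http://controller.local:6789"):
    links = ['<a href="%s/%s">%s</a>' % (server, b, b) for b in buttons]
    return "<html><body>\n" + "<br/>\n".join(links) + "\n</body></html>"

sample = """begin codes
    KEY_POWER  0x0A90
    KEY_MUTE   0x0290
end codes"""
print(buttons_to_html(lirc_buttons(sample)))
```

One pass of this over each device's LIRC file yields a crude but complete remote; prettying it up is where IUI comes in.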

Of course, the good news is that I've put this project -- IrServerSB (SB for SqueezeBox) -- up on a code hosting site. The README file there has more information about how to try it, and I've set up a Google Groups mailing list (or use the code project page, or comments here) for feedback.

The better news will be if others in the community can take this the rest of the way toward a complete solution. In particular, RC5/RC6 remotes are reportedly not supported by the Jive IR platform, and I can confirm that numerous attempts to turn on my RC6-controlled Xbox 360 left me without a satisfying eureka moment. I'd love to have that supported, and perhaps even rebuild the server as a background-running C daemon.