But Can I Kill My Liver AND My Pancreas?

Like how Manischewitz was the first alcohol for candy lovers.

Why am I just learning that this exists as an empty box? I feel like I just found a dead unicorn. I also like that “The First Candy for Beer Lovers” implies that beer lovers refers not to people who merely enjoy beer, but specifically to people who cannot stand the taste of any other food (candy included) unless it also tastes like beer. Because who can choke down the disgusting taste of chocolate, sugar, and butter so prevalent in today’s “regular” candy? Finally, our days of dunking Snickers in Sam Adams are over. Thanks, Beercandy!

Why I <3 RStudio

I’ve just barely started playing with RStudio, but I’ve already decided to make it my main R IDE. It’s taken so many of the tasks that used to frustrate me in the standard distribution and made them super simple. It’s nearly identical to using R but with a lot of new bells and whistles so, if you’re going to use R, you may as well use RStudio. That said, here are some of my pros and cons so far.

Strata Round Up Part 3: Tools (and some services) You Should Know

In the final part of my Strata recap I wanted to talk about the vast array of scraping, cleansing, graphing, plotting, visualizing, sharing, selling, searching, filtering, and otherwise gerunded tools that people showcased or used in their talks. As I mentioned in my first post, we are living in an intensely exciting time in which we have unprecedented access, borne by the power of the Internet and a thriving open source community, to datasets and tools for working with data. I was overwhelmed by the number of languages and software libraries people were using to chop up, remix, and process their data and, I have to admit, I felt a bit out of touch sitting there with C++ dangling on my finger like a stuck yo-yo while people did aerial acrobatics with Ruby and put on pyrotechnics shows with Tableau around me. This post is both an attempt to outline the tools I should at least be familiar with, if not using on a daily basis, and a way to save anyone reading the time of rounding up the latest blades in the data scientist’s Swiss Army knife themselves. Full disclosure: I haven’t used many of these, so I can’t attest to their quality beyond saying that they intrigue me. That said, here are the data tools (and some services) from Strata that I think you (read: I) should know.

Short-term Opening for C/R + Stats Person

My friends over at (media company who wants to remain anonymous) are looking for someone with good C and R knowledge to help them add some features into their software. Stats background is pretty much a must-have as the project will entail navigating and building on code that deals with statistical modeling and parameter estimation. The new features you’d be adding would also involve some stats know-how as well as the coding chops to implement them in C for use in R.

Strata Round Up Part 2: 5 Keynotes You Should Watch

There were a load of really great talks at the Strata Big Data conference (and certainly a fair share of pitches), so I wanted to distill a shortlist of keynotes that I found particularly inspiring.  You can find the entire body of Strata videos and slides at Strata’s main site and I’d encourage anyone who’s interested to peruse the wonderful collection O’Reilly has put together there.  If you just want a quick snapshot though, I’d say look to these.

Strata Round Up Part 1: Overview and Takeaways

I spent a couple of days at the Strata Big Data conference in lovely Santa Clara, California the other week talking shop about massive datasets and what to do with them. I had a wonderful time rubbing elbows with all the smart and interesting people out there with “data scientist” on their cards, but I was struck by the common theme that none of us really knew exactly what that, or “big data”, for that matter, really meant. With interest in “data science” buzzing, I thought I’d give a quick review of some high level ideas that I took away from the conference and what they say about the future of data and data science.

Yahoo! Director of Data Insights Job Opening

To apply: please send resume to seemah@yahoo-inc.com

Director, Data Insights (Data Solutions and Insights Group)

Location: Sunnyvale, CA

About Yahoo!

Think about impacting 1 out of every 2 people online–in innovative and imaginative ways that are uniquely Yahoo!. We do just that each and every day, and you could too. After all, it’s big thinkers like you who will create the next generation of Internet experiences for consumers and advertisers across the globe. Now’s the time to show the world what you’ve got. Put your ideas to work for over half a billion people.

Did you know that:

* Yahoo! serves over half a billion unique users
* Yahoo!’s user data warehouse is one of the largest in the world … in the order of ‘petabytes’
* Yahoo! collects over 10 terabytes of click stream behavioral data each day (the equivalent of all the information stored in the Library of Congress)

How Big Can You Think?

Yahoo! is looking for a Director of Data Insights to lead a star-studded team responsible for turning our rich data assets into actionable insights to further refine our consumer and advertising products, and business strategies.

DATA is a critical element to Yahoo!’s future success and influences every key decision that the company makes. The overall UDA Insights organization, comprised of Data Insights, Business Analytics, and the Experimentation Modeling teams, is responsible for enabling data-driven decision making and influencing decisions, and is at the very core of this process.

As Director of the Data Insights team, you will be responsible for establishing and managing a comprehensive applied analytics program which is aligned with company priorities and strategic pillars.

As a senior member of the Insights organization, you will work closely with the company’s leadership team, and also interact with peers across the User Data Analytics (UDA) organization as well as the larger analyst community.

Responsibilities:

* Lead a team of talented Insights managers/analysts.
* Draw upon deep product, business, and marketing experience and acumen to scope and execute relevant, focused studies which produce actionable insights, influence decisions, and produce tangible results.
* Build and manage a high-impact roadmap in close collaboration with key client organizations.
* Consistently deliver upon commitments on time and with quality and proactively manage client expectations.
* From a data analytics perspective, serve as a trusted advisor to Product, Product Marketing teams, and associated internal organizations on matters related to user Acquisition, Engagement, and Revenue Growth.
* Develop and maintain solid relationships with senior clients.
* Act as a leading member of the company’s global analytics community, establishing means of collaborating in order to achieve goals.
* Be a strong advocate for the data needs of the team and secure support from partner organizations.
* Provide support to internal clients in the areas of Metrics/KPI development as needed.

Minimum Job Qualifications:

* 8 years of experience in quantitative market research / analytics in an online environment
* Solid experience leading teams of 5+ analysts and research managers
* Strong familiarity with statistical tools as well as deep experience working with large datasets
* Outstanding communication and presentation skills
* Proven ability to design and lead time-sensitive strategic research projects involving goal setting, requirements gathering, methodology development, analysis, strategic recommendations and presentation of findings to Senior Management, and follow up
* Strong attention to detail and ability to effectively manage multiple projects
* Strong Web Product and Business acumen
* An MBA or advanced degree in an analytical field, such as Computer Science, Mathematics or Economics

Yahoo! Inc. is an equal opportunity employer. For more information or to search all of our openings please visit http://careers.yahoo.com.

NYT Flickr Tagger

I’d been meaning to write about last December’s TimesOpen Hack Day at the New York Times for a while, but I fell behind and never got around to it. It was a really great experience that brought a lot of intimidatingly smart and creative people together to pull off some MacGyver-esque data and software hacks, which ultimately resulted in a lot of really cool and useful projects. You can read more about the actual Hack Day here, but I wanted to chime in about a project I worked on with Flickr’s Chris Martin (@cjmartin) that combines New York Times linked data with Flickr images. I think it’s a pretty cool tool, and I welcome anyone to try it out and let us know what you think. More after the jump.

C4 Paper Accepted to PAMI

Great news if you like journal papers about constrained graph optimization: a paper I wrote with Song Chun Zhu on a new algorithm to do just that, titled “C4: Exploring Multiple Solutions in Graphical Models by Cluster Sampling,” was recently accepted to Transactions on Pattern Analysis and Machine Intelligence (PAMI) and will be published in an upcoming issue. Is the suspense already killing you? Well, fret not: you can be the coolest kid on your block and get the pre-publication draft right now here, with appendix here. Enjoy!

Fact Checking the MTA

Believe It! (from https://www.blogger.com/blogin.g?blogspotURL=http://bitchcakescommutes.blogspot.com/2009_03_01_archive.html)

From the NYC MTA: “Believe it or not. In 1986 the subway and bus fare was $1. That’s $1.89 in 2008 dollars. Today a 30-day Unlimited Ride MetroCard brings the fare down to $1.17. Believe it.” Should we believe it?
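
For anyone who wants to poke at the MTA's arithmetic themselves, here is a rough Python sketch of the two checks the ad invites: how the claimed $1.17 compares with the 1986 fare in 2008 dollars, and how many rides it takes before a 30-day pass actually gets down to that per-ride fare. The three fares come straight from the quote; the pass price is my own placeholder assumption (the ad doesn't state one), so swap in the real number before drawing conclusions. Everything is in whole cents to avoid floating point rounding surprises.

```python
# Figures quoted in the MTA ad, in cents.
FARE_1986_IN_2008_CENTS = 189   # $1 in 1986, restated by the MTA in 2008 dollars
CLAIMED_PASS_FARE_CENTS = 117   # per-ride fare the ad claims for the 30-day pass

# ASSUMPTION: the ad does not state the pass price. This is a placeholder.
ASSUMED_PASS_PRICE_CENTS = 8100


def effective_fare_cents(pass_price_cents: int, rides: int) -> float:
    """Per-ride cost of an unlimited pass, given how many rides you take."""
    if rides <= 0:
        raise ValueError("rides must be positive")
    return pass_price_cents / rides


def rides_needed(pass_price_cents: int, target_fare_cents: int) -> int:
    """Smallest number of rides that brings the per-ride cost to the target or below."""
    if target_fare_cents <= 0:
        raise ValueError("target fare must be positive")
    # Integer ceiling division: -(-a // b) == ceil(a / b) for positive integers.
    return -(-pass_price_cents // target_fare_cents)


def real_change(fare_now_cents: int, old_fare_in_todays_cents: int) -> float:
    """Fractional change in the fare after adjusting the old fare for inflation."""
    return fare_now_cents / old_fare_in_todays_cents - 1.0


if __name__ == "__main__":
    rides = rides_needed(ASSUMED_PASS_PRICE_CENTS, CLAIMED_PASS_FARE_CENTS)
    change = real_change(CLAIMED_PASS_FARE_CENTS, FARE_1986_IN_2008_CENTS)
    print(f"Rides per 30 days needed to hit the claimed fare: {rides}")
    print(f"Rides per day that works out to: {rides / 30:.2f}")
    print(f"Claimed fare vs. inflation-adjusted 1986 fare: {change:+.1%}")
```

The point of the sketch is that the $1.17 figure is not a fare at all but a pass price divided by some assumed number of rides, so whether you should "believe it" depends entirely on how often you actually swipe.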
