ACM hosts Hulu talk on ‘big data’

The Stanford Association for Computing Machinery (ACM) hosted its weekly Tech Talk last Friday night in the Gates Computer Science Building, featuring Shane Moriah ’09, a Hulu software developer who gave a talk titled “I Know What You Watched Last Summer: An Overview of Hadoop, Big Data and Metrics at Hulu.”

ACM’s weekly Tech Talks feature two students, faculty or community members giving talks about various computer science-related topics. The group’s website deems the idea “simple–smart people sharing their cool CS hacks, research and tech demos with the Stanford CS community.” This week’s second speaker was Sam King ’12 on “Computers, Exploitation and Empowerment.”

Hulu software developer Shane Moriah '09 gives a Tech Talk as part of the Association for Computing Machinery lecture series. (WENDING LU/The Stanford Daily)

Moriah spoke on the relevance of data to almost every industry.

“Metrics are the core of data-driven business,” he said. “Pretty much every business now–every tech business, every non-tech business–is interested in metrics.”

“You can understand your users and usage, develop and track new existing products–and, of course, there’s money,” he continued. “We make most of our money off of advertisements, so being able to track what advertisements are shown and when–that’s how we make our money.”

According to Moriah, data is used to track a variety of factors across the Hulu site: views per video, number of advertisements shown and quality of service, as well as “individual viewer data,” or how people behave across the lifetime of their visit to the site.

He also described how the company collects data through “beacons” on web pages designed to track usage.

“What you do [when you watch a video] is send this URL off,” he said. “This is basically what happens when you load an advertisement…it gets sent back to us and processed.”

Using a URL-parsing key, the company records the time, IP address, advertisement info and user location for each beacon, along with the user ID for registered viewers. It processes about one billion of these “beacons” per day, which Moriah called an “issue of scale.”
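To make the beacon idea concrete, here is a minimal Python sketch of how a single beacon request might be turned into a record. It is an illustration only: the URL, host, path and parameter names (`ts`, `ad_id`, `video_id`, `region`, `user_id`) are hypothetical and were not described in the talk. The last function simply works out what one billion beacons per day means per second.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical beacon URL. The host, path and parameter names are
# illustrative assumptions, not Hulu's actual format.
EXAMPLE_BEACON = (
    "https://beacon.example.com/v1/ad?"
    "ts=1318636800&ad_id=ad42&video_id=vid7&region=US-CA&user_id=u123"
)


def parse_beacon(url, client_ip):
    """Turn one beacon request into a flat record for later aggregation."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each key to a list of values; keep the first one.
    fields = {key: values[0] for key, values in query.items()}
    return {
        "time": int(fields["ts"]),
        "ip": client_ip,  # taken from the request, not from the URL
        "ad_id": fields.get("ad_id"),
        "video_id": fields.get("video_id"),
        "region": fields.get("region"),
        "user_id": fields.get("user_id"),  # None for unregistered viewers
    }


def beacons_per_second(per_day):
    """Average request rate implied by a daily beacon count."""
    return per_day / (24 * 60 * 60)


if __name__ == "__main__":
    print(parse_beacon(EXAMPLE_BEACON, "203.0.113.5"))
    print(round(beacons_per_second(1_000_000_000)))
```

Spread evenly over a day, one billion beacons works out to an average of roughly 11,600 requests per second, which gives a sense of the “issue of scale” Moriah described.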

Data is also used to balance content and user interface and to drive some larger company decisions, including through “negative results” that flag unsuccessful initiatives.

“When we launch things, there is a tendency to do product design by gut, Steve Jobs’ philosophy,” he said. “That sometimes works…Sometimes you see that your big dream projects should actually get axed.”

He also revealed that Hulu is “in the slow process of a site-redesign, which is perhaps a secret.”

The Stanford ACM, which Hackathon coordinator Alex Atallah ’14 calls a “computer enthusiast” group, is a chapter of the nationwide scientific and educational computing society. The Stanford chapter was founded in 2007 and has approximately 400 members; nationwide, membership was about 83,000 as of 2007.

The group reaches out to companies and students, inviting them to speak about whatever CS-related subject interests them. According to Atallah, the talks’ audiences are fairly consistent.

“The vast majority are Stanford people, Stanford undergrads,” he said. “It’s maybe five percent from outside of Stanford and 90 percent undergrads. The people who come from outside of Stanford, some of them are entrepreneurs…Others are just tech enthusiasts. They’re really excited about it.”

He also said that attendees tend to be CS majors, although some are symbolic systems majors or students interested in “general entrepreneurship.”

As for ACM, Atallah described the group’s mission as built around a triad.

“ACM is the biggest computing enthusiast group on Stanford campus, and we try to get people together, some of whom are not sure how to start an idea that they have, and some of them have the technical background, but want inspiration,” he said. “We try to get these people to create at hackathons, to listen at Tech Talks and to learn at workshops.”

ACM’s next Tech Talk is Friday, Oct. 21 at 6 p.m. and will feature Ali Yahya ’10 on “Software Defined Networking.”