
{ Tag Archives } hadoop

Tech Talk: Nathan Marz — “Clojure at BackType”

Clojure at BackType Nathan Marz (BackType) Tuesday, October 26, 2010 ABSTRACT Clojure has led to a significant reduction in complexity in BackType’s systems. BackType uses Clojure all over the backend, from processing data on Hadoop to a custom database to realtime workers. In this talk Nathan will give a crash course on Clojure and using [...]


Tech Talk: Klaas Bosteels — “Hadoop at Last.FM”

Hadoop at Last.FM Klaas Bosteels (Last.FM) Monday, June 28, 2010 ABSTRACT This talk is about the usage of Hadoop at Last.fm, a community-driven music discovery website. We will go through the main types of data Last.fm stores in Hadoop, explain why we need Hadoop to store and process our data, give examples of what we [...]


Tech Talk: Sam Rash (Facebook) — “Low Latency Message Bus With Scribe and HDFS”

Sam Rash from Facebook came by and talked to us about how they provide near-realtime access to data logged from Scribe into HDFS. Very fascinating. Enjoy! Low Latency Message Bus with Scribe and HDFS Sam Rash (Facebook) Tuesday, August 31, 2010 ABSTRACT This talk covers the Data Freeway project at Facebook, which centers around [...]


Building Voldemort read-only stores with Hadoop

A well-known lesson in scalability is that writes are 40x more expensive than reads, and if your application becomes write-intensive, as is easily the case when you are dealing with a sufficiently large number of users, you will be in trouble if you don’t design to scale. For example, if you are using MySQL, you [...]


Building a terabyte-scale data cycle at LinkedIn with Hadoop and Project Voldemort

Many of LinkedIn’s products are critically dependent on computationally intensive data mining algorithms. Examples of these include some modules like People You May Know, Viewers of This Profile Also Viewed, and much of the Job matching functionality that we give to people who post jobs on the site. To support these data-intensive products we have [...]
