<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Netuality &#187; myNoSQL</title>
	<atom:link href="http://www.netuality.ro/tag/mynosql/feed" rel="self" type="application/rss+xml" />
	<link>http://www.netuality.ro</link>
	<description>Taming the big, bad, nasty websites</description>
	<lastBuildDate>Mon, 07 Nov 2011 16:36:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>January 12 linkdump: Reddit on Hadoop on steroids, Hadoop lessons learned</title>
		<link>http://www.netuality.ro/january-12-linkdump-reddit-on-hadoop-on-steroids-hadoop-lessons-learned/linkdump/20100112</link>
		<comments>http://www.netuality.ro/january-12-linkdump-reddit-on-hadoop-on-steroids-hadoop-lessons-learned/linkdump/20100112#comments</comments>
		<pubDate>Tue, 12 Jan 2010 18:25:23 +0000</pubDate>
		<dc:creator>Adrian</dc:creator>
				<category><![CDATA[Linkdump]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[myNoSQL]]></category>
		<category><![CDATA[ReadPath]]></category>
		<category><![CDATA[Reddit]]></category>

		<guid isPermaLink="false">http://www.netuality.ro/?p=148</guid>
		<description><![CDATA[Great Hadoop story, and a great read too, from Lau Jensen on the Best In Class blog: Hadoop opens a world of fun with the promise of some heavy lifting and in order to feed the beast I’ve written a Reddit-scraper in just 30 lines of Clojure. [...] Now that we’re sitting with almost unlimited insight [...]]]></description>
			<content:encoded><![CDATA[<p>Great Hadoop story, and a great read too, from Lau Jensen on the <a href="http://www.bestinclass.dk/index.php/2010/01/hadoop-feeding-reddit-to-hadoop/" target="_blank">Best In Class blog</a>:</p>
<blockquote><p>Hadoop opens a world of fun with the promise of some heavy lifting and in order to feed the beast I’ve written a Reddit-scraper in just 30 lines of Clojure.</p>
<p>[...]</p>
<p>Now that we’re sitting with almost unlimited insight into the posts which make Redditors tick, we can think of many stats that would be fun to compute. Since this is a tutorial I’ll go with the simplest version, i.e. something like calculating total number of upvotes per domain/author, but for a future experiment it would be fun to pull out the top authors/posts and also scrape the URLs they link, categorizing them after content length, keywords, number of graphical elements etc, just to get the recipe for a successful post.</p></blockquote>
<p>Alex Popescu has <a href="http://nosql.mypopescu.com/post/330657421/lessons-learned-from-using-hadoop-and-hbase-in" target="_blank">a few notes and questions</a> about <a href="http://www.readpath.com/" target="_blank">ReadPath</a>'s <a href="http://blog.readpath.com/2009/12/28/hadoop-and-hbase-in-production/" target="_blank">use of Hadoop</a> in production:</p>
<blockquote><p>If you thought using NoSQL solutions would automatically address and solve backup and restore policies, you were wrong. [...]</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.netuality.ro/january-12-linkdump-reddit-on-hadoop-on-steroids-hadoop-lessons-learned/linkdump/20100112/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

