<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Netuality &#187; EC2</title>
	<atom:link href="http://www.netuality.ro/tag/ec2/feed" rel="self" type="application/rss+xml" />
	<link>http://www.netuality.ro</link>
	<description>Taming the big, bad, nasty websites</description>
	<lastBuildDate>Mon, 07 Nov 2011 16:36:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Linkdump: Cassandra lovers, blowing the circuit breaker and Oracle clouds</title>
		<link>http://www.netuality.ro/linkdump-cassandra-lovers-blowing-the-circuit-breaker-and-oracle-clouds/linkdump/20100304</link>
		<comments>http://www.netuality.ro/linkdump-cassandra-lovers-blowing-the-circuit-breaker-and-oracle-clouds/linkdump/20100304#comments</comments>
		<pubDate>Thu, 04 Mar 2010 18:31:13 +0000</pubDate>
		<dc:creator>Adrian</dc:creator>
				<category><![CDATA[Linkdump]]></category>
		<category><![CDATA[Cassandra]]></category>
		<category><![CDATA[EC2]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HBase]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://www.netuality.ro/?p=181</guid>
		<description><![CDATA[Good points (as always) on Alexandru&#8217;s blog discussing the SQL scalability isn&#8217;t for everyone topic. NoSQL as RDBMS are just tools for our job and there is nothing about the death of one of the other. But as we’ve learned over years, every new programming language is the death of all its precursors, every new [...]]]></description>
			<content:encoded><![CDATA[<p>Good points (as always) on Alexandru&#8217;s blog discussing the <a href="http://nosql.mypopescu.com/post/424164220/sql-is-scalable-sql-scalability-isnt-for-everyone" target="_blank">SQL scalability isn&#8217;t for everyone</a> topic.</p>
<blockquote><p>NoSQL as RDBMS are just tools for our job and there is nothing about the  death of one of the other. But as we’ve learned over years, every new  programming language is the death of all its precursors, every new  programming paradigm is the death of everything that existed before and  so on. The part that some seem to be missing or ignoring deliberately is  that in most of these cases this death have never really happened.</p></blockquote>
<p>For large-scale performance testing of a production environment check out how <span style="text-decoration: line-through;">Facebook</span> MySpace <a href="http://highscalability.com/blog/2010/3/4/how-myspace-tested-their-live-site-with-1-million-concurrent.html" target="_blank">simulated 1 million concurrent users</a> with a huge EC2 cluster, described on the High Scalability blog. While the article is a guest post from a company selling &#8220;cloud testing&#8221; solutions and has a bit of &#8220;sales juice&#8221; in it, it&#8217;s still a very good read:</p>
<p style="text-align: center;"><img class="aligncenter" title="Large-scale testing using EC2" src="http://farm3.static.flickr.com/2776/4405976247_0fd13b6f26.jpg?__SQUARESPACE_CACHEVERSION=1267718646170" alt="Large-scale testing using EC2" width="500" height="342" /></p>
<p>Someone is <a href="https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/" target="_blank">in love with Cassandra</a> after only 4 months. Hoping Cassandra doesn&#8217;t get too fat after the wedding:</p>
<blockquote><p>Traditional sharding and replication with databases like MySQL and   PostgreSQL have been shown to work even on the largest scale websites —   but come at a large operational cost. Setting up replication for MySQL   can be done quickly, but there are many issues you need to be aware of,   such as slave replication lag. Sharding can be done once you reach  write  throughput limits, but you are almost always stuck writing your  own  sharding layer to fit how your data is created and operationally,  it  takes a lot of time to set everything up correctly. We skipped that  step  all together and added a couple hooks to make our data aggregation   service siphon to both PostgreSQL and Cassandra for the initial   integration.</p></blockquote>
<p><a href="http://www.anders.com/cms/282/Distributed.Data/Hadoop/Hbase/Hive" target="_blank">Distributed data war stories</a> from Anders @ bandwidth.com, HBase and Hadoop on commodity hardware:</p>
<blockquote><p>As mentioned before, the commodity machines I used were very basic but I  was able to insert conservatively about 500 records per second with  this setup. I kept blowing the circuit breaker at the office as well  forcing me to spread the machines across several power circuits but it  proved that the system was at least fault tolerant!</p></blockquote>
<p><a href="http://www.thebitsource.com/software-engineering/python/sourceforgenet-chooses-python-turbogears-and-mongodb-to-redesign-their-web-site/" target="_blank">SourceForge chooses Python, TurboGears and &#8230; MongoDB</a> for a new version of their website. Looks like Mongo is becoming quite mainstream.</p>
<p>Don&#8217;t believe the rumors, <a href="http://blogs.forrester.com/appdev/2010/03/oracle-has-a-cloud-strategy-after-all.html" target="_blank">Oracle is into cloud computing after all</a> &#8211; at least according to Forrester. Well, as long as the clouds are private. And as long as you can live with &#8220;coming soon&#8221; tooling. And it&#8217;s not like they really have a clear long-term strategy for cloud computing:</p>
<blockquote><p>I believe that cloud is a revolution for Oracle, IBM, SAP, and the other big  vendors with direct sales forces (despite what they say). Cloud computing has the  potential to undermine the account-management practices and pricing models these big companies are  founded on. I think it will take years for each of the big vendors to adapt to cloud computing. Oracle is just beginning this journey; I think other  vendors are further down the track.</p></blockquote>
<p>The igvita blog hits NoSQL in the groin by <a href="http://www.igvita.com/2010/03/01/schema-free-mysql-vs-nosql/" target="_blank">showing a simple way of having a schema-free data store</a> &#8230; in MySQL. It&#8217;s a sort of proxy that splits each record across per-attribute tables, one table for every attribute:</p>
<blockquote><p>Instead of defining columns on a table, each attribute has its own table  (new tables are created on the fly), which means that we can add and  remove attributes at will. In turn, performing a select simply means  joining all of the tables on that individual key. To the client this is  completely transparent, and while the proxy server does the actual work,  this functionality could be easily extracted into a proper MySQL engine  &#8211; I’m just surprised that no one has done so already.</p></blockquote>
<p>It&#8217;s an interesting idea, but I&#8217;m not sure how effective it will be in practice, as joins are among the most time-consuming operations in the database world. I&#8217;m pretty sure that replacing a primary-key lookup on a 10-column table with a join across 10 tables will add significant overhead.</p>
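<p>To make the cost concrete, here is a minimal sketch of the attribute-per-table idea. It assumes SQLite (through Python&#8217;s sqlite3 module) as a stand-in for MySQL, and the table and attribute names (entity, attr_name, attr_email, attr_city) are hypothetical. It is not the igvita proxy itself, only an illustration of why reading one record back takes one join per attribute:</p>

```python
import sqlite3

# Hypothetical attribute names; SQLite stands in for MySQL in this sketch.
ATTRIBUTES = ["name", "email", "city"]


def create_store(conn):
    """Create one key table plus one two-column table per attribute."""
    conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY)")
    for attr in ATTRIBUTES:
        conn.execute(
            f"CREATE TABLE attr_{attr} ("
            "id INTEGER PRIMARY KEY REFERENCES entity(id), value TEXT)"
        )


def insert(conn, entity_id, **values):
    """Store only the attributes that are present for this record."""
    conn.execute("INSERT INTO entity (id) VALUES (?)", (entity_id,))
    for attr, value in values.items():
        if attr not in ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attr}")
        conn.execute(
            f"INSERT INTO attr_{attr} (id, value) VALUES (?, ?)",
            (entity_id, value),
        )


def fetch(conn, entity_id):
    """Rebuild one record: a LEFT JOIN per attribute table on the key."""
    columns = ", ".join(f"attr_{a}.value" for a in ATTRIBUTES)
    joins = " ".join(
        f"LEFT JOIN attr_{a} ON attr_{a}.id = entity.id" for a in ATTRIBUTES
    )
    row = conn.execute(
        f"SELECT {columns} FROM entity {joins} WHERE entity.id = ?",
        (entity_id,),
    ).fetchone()
    if row is None:
        return None
    return dict(zip(ATTRIBUTES, row))


conn = sqlite3.connect(":memory:")
create_store(conn)
insert(conn, 1, name="Ada", email="ada@example.org")
record = fetch(conn, 1)
print(record)
```

<p>With three attributes the read already joins three tables against the key table; a 10-attribute record would need ten such joins where a conventional table needs a single primary-key lookup.</p>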
]]></content:encoded>
			<wfw:commentRss>http://www.netuality.ro/linkdump-cassandra-lovers-blowing-the-circuit-breaker-and-oracle-clouds/linkdump/20100304/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Benchmarking the cloud: not simple</title>
		<link>http://www.netuality.ro/benchmarking-the-cloud-not-simple/datacenter/20100118</link>
		<comments>http://www.netuality.ro/benchmarking-the-cloud-not-simple/datacenter/20100118#comments</comments>
		<pubDate>Mon, 18 Jan 2010 07:02:31 +0000</pubDate>
		<dc:creator>Adrian</dc:creator>
				<category><![CDATA[Datacenter]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[benchmark]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[EBS]]></category>
		<category><![CDATA[EC2]]></category>
		<category><![CDATA[Rackspace]]></category>

		<guid isPermaLink="false">http://www.netuality.ro/?p=155</guid>
		<description><![CDATA[Understanding the impact of using virtualized servers instead of real ones is perhaps one of the most complex issues when migrating from a traditional configuration to a cloud-based setup. Especially because virtualized servers are created equal &#8230; but only on paper. A Rackspace-funded &#8220;report&#8221; tries to find out the performance differences between Rackspace Cloud Servers [...]]]></description>
			<content:encoded><![CDATA[<p>Understanding the impact of using virtualized servers instead of real ones is perhaps one of the most complex issues when migrating from a traditional configuration to a cloud-based setup. Especially because virtualized servers are created equal &#8230; but only on paper.</p>
<p>A Rackspace-funded &#8220;report&#8221; tries to find out <a href="http://www.thebitsource.com/2010/01/11/rackspace-cloud-servers-versus-amazon-ec2-performance-analysis/" target="_blank">the performance differences</a> between Rackspace Cloud Servers and Amazon EC2. I guess the only conclusion we can draw from their so-called report is that Cloud Server disk throughput is better than EC2&#8217;s. As the &#8220;CPU test&#8221; is a kernel compile, which also stresses the disk, I don&#8217;t think we can reliably draw any conclusion about raw CPU performance from it.</p>
<p style="text-align: center;"><img class="size-full wp-image-156 aligncenter" title="rackspace_amazon_benchmark" src="http://www.netuality.ro/wp-content/uploads/2010/01/rackspace_amazon_benchmark.gif" alt="" width="600" height="275" /></p>
<p>An <a href="http://www.thebitsource.com/2010/01/11/rackspace-cloud-servers-versus-amazon-ec2-performance-analysis/#IDComment52135232" target="_blank">intrepid commenter</a> ran a CPU-only test (Geekbench) and found out that <a href="http://browse.geekbench.ca/geekbench2/view/203592" target="_blank">EC2</a> performs slightly better than <a href="http://browse.geekbench.ca/geekbench2/view/187589" target="_blank">Rackspace</a> in terms of raw processor performance. The same commenter, affiliated with <a href="http://cloudharmony.com/status" target="_blank">Cloud Harmony</a>,  mentions that a simple hdparm test shows that Rackspace hdd has more than twice the throughput of EC2 hdd, at least in terms of buffered reads. Last but not least, don&#8217;t forget that for better disk performance Amazon recommends <a href="http://blog.rightscale.com/2008/08/20/amazon-ebs-explained/" target="_blank">EBS</a> instead of the VM disk.</p>
<p>We cannot reliably make an informed cloud vendor choice just using VM benchmarks. Ideally, you should benchmark your own app on each cloud infrastructure and choose the one which gives you the best user-facing performance, because at the end of the day this is what matters most. Sadly, today this means experimenting with sometimes wildly different APIs and provisioning models.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.netuality.ro/benchmarking-the-cloud-not-simple/datacenter/20100118/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>January 13 linkdump: KDD, EC2 congested, Coherence, Zimbra</title>
		<link>http://www.netuality.ro/january-13-linkdump-kdd-ec2-congested-coherence-zimbra/linkdump/20100113</link>
		<comments>http://www.netuality.ro/january-13-linkdump-kdd-ec2-congested-coherence-zimbra/linkdump/20100113#comments</comments>
		<pubDate>Wed, 13 Jan 2010 17:23:08 +0000</pubDate>
		<dc:creator>Adrian</dc:creator>
				<category><![CDATA[Linkdump]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[Coherence]]></category>
		<category><![CDATA[EC2]]></category>
		<category><![CDATA[KDD]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Yahoo]]></category>
		<category><![CDATA[Zimbra]]></category>

		<guid isPermaLink="false">http://www.netuality.ro/?p=152</guid>
		<description><![CDATA[Call to arms for the annual ACM KDD Conference. KDD stands for Knowledge Discovery and Data Mining, so if you&#8217;re looking for some hardcore use cases and new algorithms to apply, this is definitely the place to be (Washington, July 25-28): KDD-2010 will feature keynote presentations, oral paper presentations, poster sessions, workshops, tutorials, panels, exhibits, [...]]]></description>
			<content:encoded><![CDATA[<p>Call to arms for the annual <a href="http://www.kdd2010.com/" target="_blank">ACM KDD Conference</a>. KDD stands for Knowledge Discovery and Data Mining, so if you&#8217;re looking for some hardcore use cases and new algorithms to apply, this is definitely the place to be (Washington, July 25-28):</p>
<blockquote><p>KDD-2010 will feature keynote presentations, oral paper presentations, poster sessions, workshops, tutorials, panels, exhibits, demonstrations, and the KDD Cup competition.</p></blockquote>
<p>There&#8217;s a rumor on the street that Amazon EC2 is over-subscribed. <a href="http://alan.blog-city.com/has_amazon_ec2_become_over_subscribed.htm#" target="_blank">From the trenches</a> it appears that their scalability is &#8230; well, duh &#8230; not infinite and elasticity is a tiny bit rigid:</p>
<blockquote><p>Anyone that uses virtualized computing, whether it is in the cloud or in their own private setup (VMWare for example) knows you take a performance hit. These performance hits can be considerable, but on the whole, are tolerable and can be built into an architecture from the start.</p>
<p>The problems that we are starting to see from Amazon, are more than just the overhead of a virtualized environment. They are deep rooted scalability problems at their end that need to be addressed sooner rather than later.</p></blockquote>
<p>My Adobe colleague <a href="http://horicky.blogspot.com" target="_blank">Ricky Ho</a> has <a href="http://horicky.blogspot.com/2010/01/notes-on-oracle-coherence.html" target="_blank">posted some notes on Oracle&#8217;s Coherence</a> (formerly Tangosol), a distributed Java cache rich in features. A great read especially if you want a technical intro to the product (code snippets and everything).</p>
<p>The acquisition of the day is <a href="http://paidcontent.org/article/419-confirmed-yahoo-sells-zimbra-to-vmware/" target="_blank">Zimbra being bought by VMWare</a>. Yahoo is selling Zimbra at a loss, it seems. Analysts wonder what exactly VMWare is planning to do. They&#8217;re probably going up the stack and working on providing their own cloud ecosystem and related services. &#8220;VMWare Applications&#8221;, soon?</p>
<blockquote><p>Under the terms of the agreement, Yahoo can continue to use Zimbra technology in its communications services. VMWare’s interest in Zimbra is a bit of a mystery since VMWare focuses on selling virtualization technology; in the release, VMWare offers somewhat of an explanation saying that the purchase furthers its “mission of taking complexity out of the datacenter, desktop, application development and core IT services”</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://www.netuality.ro/january-13-linkdump-kdd-ec2-congested-coherence-zimbra/linkdump/20100113/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

