<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Control Group &#187; database</title>
	<atom:link href="http://blog.controlgroup.com/tag/database/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.controlgroup.com</link>
	<description>Technology for Big Ideas.</description>
	<lastBuildDate>Tue, 31 Jan 2012 15:14:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Big Data: SQL Planning &amp; Migration to Spark and Hadoop</title>
		<link>http://blog.controlgroup.com/2011/10/11/solving-big-data-problems-requires-big-solutions/</link>
		<comments>http://blog.controlgroup.com/2011/10/11/solving-big-data-problems-requires-big-solutions/#comments</comments>
		<pubDate>Tue, 11 Oct 2011 13:19:54 +0000</pubDate>
		<dc:creator>David Rocamora</dc:creator>
				<category><![CDATA[general]]></category>
		<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[migration]]></category>
		<category><![CDATA[spark]]></category>
		<category><![CDATA[SQL planning]]></category>

		<guid isPermaLink="false">http://blog.controlgroup.com/?p=1686</guid>
		<description><![CDATA[I was in a meeting the other day discussing a problem that a client keeps running into. They need a platform to analyze trends in a rapidly growing data set, where the criteria are changing as fast as their business is changing, which, as it turns out, is pretty fast.]]></description>
			<content:encoded><![CDATA[<p>I was in a meeting the other day discussing a problem that a client keeps running into. They need a platform to analyze trends in a rapidly growing data set, where the criteria are changing as fast as their business is changing, which, as it turns out, is pretty fast. Right now they are storing the data in a relational database and writing complex SQL queries to mine information from it. The DBA told us that he would run a query and then go to lunch, hoping it would be done by the time he got back. They need the results faster, and they know that their problem is just going to get worse as the data grows.</p>
<p>The knee-jerk reaction to a problem like this is to get a bigger database server. Sure, this may help right now when the data is only a few hundred gigabytes, but what happens when we are dealing with a few hundred terabytes? A few hundred petabytes? This kind of solution just does not scale.</p>
<p>The real answer here is to step back, examine the problem, understand what the goal is, and then design a process that can achieve that goal. In this case, the problem is that a business needs to be able to understand patterns and trends in a rapidly growing data set. The goal is to be able to do this quickly and consistently even as the data grows. One way to achieve this is to use something like <a href="http://hadoop.apache.org/">Hadoop</a> or <a href="http://www.spark-project.org/">Spark</a> to build a cluster that can scale as the data scales.</p>
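<p>To make the idea a little more concrete, here is a minimal sketch in plain Python (not actual Hadoop or Spark code) of how a SQL-style <code>GROUP BY</code> sum can be split into a step that runs independently on each partition of the data and a step that merges the partial results. The record layout, the sample partitions, and the function names are all made up for illustration; a real cluster would distribute the partitions across machines.</p>
<pre>
```python
from collections import Counter
from functools import reduce

# Hypothetical records: (region, order_total), already split into partitions.
partitions = [
    [("east", 10), ("west", 5), ("east", 7)],
    [("west", 3), ("north", 8)],
    [("east", 1), ("north", 2)],
]


def map_partition(rows):
    """Sum order totals per region within a single partition."""
    totals = Counter()
    for region, amount in rows:
        totals[region] += amount
    return totals


def merge(left, right):
    """Combine two partial results; adding Counters sums matching keys.

    Note: Counter addition drops keys whose total is zero or negative,
    which is fine for the positive amounts assumed here.
    """
    return left + right


def totals_by_region(parts):
    # Each call to map_partition is independent of the others, so in a
    # real cluster these could run on different machines in parallel.
    partials = [map_partition(p) for p in parts]
    return dict(reduce(merge, partials, Counter()))


result = totals_by_region(partitions)
print(result)
```
</pre>
<p>Because each partition is summarized on its own, growing data can, at least in principle, be handled by adding partitions and machines instead of buying one bigger server.</p>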
<p>There were concerns as soon as I brought this up: What about the schema? How do you write SQL for that? Why not just shard the database? Some of these concerns may be valid, but I feel we must evaluate this without emotion. Do people want to use the relational database because it is a better solution for the problem or because they feel comfortable with it?</p>
<p>I’m not sure it’s accurate to say that we are facing new problems these days, but the shape and size of our problems have changed. Now even the smallest company has something to gain from working with big data: <a href="http://aws.amazon.com/elasticmapreduce/">anyone with a credit card can spin up a compute cluster</a>. We should not be afraid to change our tools as our challenges change.</p>
<p>Technology is continuously evolving. This means our tools are continuously changing and so must our processes for tackling new challenges.  I believe that the system we came up with in that meeting will be the one to solve our client’s problem. If someone gave us the same problem five years ago or five years from now we would probably have wildly different suggestions, but we would come to those suggestions in the same way: through deep understanding of both the problem and the technology available.</p>

]]></content:encoded>
			<wfw:commentRss>http://blog.controlgroup.com/2011/10/11/solving-big-data-problems-requires-big-solutions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Enterprise Clients Continue To Warm To The Cloud</title>
		<link>http://blog.controlgroup.com/2011/03/23/enterprise-clients-continue-to-warm-to-the-cloud/</link>
		<comments>http://blog.controlgroup.com/2011/03/23/enterprise-clients-continue-to-warm-to-the-cloud/#comments</comments>
		<pubDate>Wed, 23 Mar 2011 21:25:26 +0000</pubDate>
		<dc:creator>Stephen Croll</dc:creator>
				<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[enterprise]]></category>

		<guid isPermaLink="false">http://blog.controlgroup.com/?p=1523</guid>
		<description><![CDATA[Lately we&#8217;ve been working with clients that haven&#8217;t been the typical EC2 infrastructure consumer. Historically, it has been the startup companies that we work with that have been interested in AWS for all the expected reasons: flexibility, pay-for-what-you-need, access to higher end services like load balancing and HA database deployments, etc. Recently we have been noticing that our [...]]]></description>
			<content:encoded><![CDATA[<p>Lately we&#8217;ve been working with clients that aren&#8217;t the typical <a href="http://aws.amazon.com/ec2/">EC2</a> infrastructure consumer. Historically, the startups we work with have been the ones interested in <a href="http://aws.amazon.com/">AWS</a> for all the expected reasons: flexibility, pay-for-what-you-need, and access to higher-end services like load balancing and HA database deployments. Recently we have noticed that our more established enterprise clients have taken an interest in these capabilities, largely for the same reasons.</p>
<p>Large enterprises looking at cloud infrastructure bring their own requirements and challenges. We plan to write a series of blog posts about Control Group&#8217;s experiences with these types of clients and what we learned. Some of the posts will be about the projects and their politics, and some will be about our technology approach. There are some interesting technology and organizational challenges that we will discuss, so stay tuned.</p>

]]></content:encoded>
			<wfw:commentRss>http://blog.controlgroup.com/2011/03/23/enterprise-clients-continue-to-warm-to-the-cloud/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Testing Storage Performance with iozone</title>
		<link>http://blog.controlgroup.com/2009/08/03/testing-storage-performance-with-iozone/</link>
		<comments>http://blog.controlgroup.com/2009/08/03/testing-storage-performance-with-iozone/#comments</comments>
		<pubDate>Mon, 03 Aug 2009 19:28:45 +0000</pubDate>
		<dc:creator>David Rocamora</dc:creator>
				<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[engineering]]></category>
		<category><![CDATA[enterprise]]></category>
		<category><![CDATA[integration]]></category>
		<category><![CDATA[SAN]]></category>
		<category><![CDATA[server]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://blog.controlgroup.com/?p=416</guid>
		<description><![CDATA[As I&#8217;ve mentioned in previous posts about testing storage performance with lmdd and bonnie++, different applications require different characteristics from storage to provide the best performance. I&#8217;ve highlighted some tests that are good for large streaming files like video, and small file transactions like databases or mail servers. Today I want to look at a [...]]]></description>
			<content:encoded><![CDATA[<p>As I&#8217;ve mentioned in previous posts about testing storage performance with <a href="http://blog.controlgroup.com/2009/06/08/testing-storage-performance-for-video-with-lmdd/">lmdd</a> and <a href="http://blog.controlgroup.com/2009/07/21/testing-storage-performance-with-bonnie/">bonnie++</a>, different applications require different characteristics from storage to provide the best performance. I&#8217;ve highlighted some tests that are good for large streaming files like video, and small file transactions like databases or mail servers. Today I want to look at a tool that runs a series of tests in many different ways to provide you with a holistic view of what the storage can and can&#8217;t do.</p>
<p>This tool is called <a href="http://www.iozone.org">iozone</a>. iozone is open source and runs on a ton of operating systems (including Windows). It runs several tests which can take some time to complete but provide the best overall view of the capabilities of a piece of storage. For instance, iozone runs a write test with files of different sizes and with different size records (the amount of data written at a time). It does this over and over again with writes, reads, random writes, random reads, and so forth. Since it&#8217;s running all these tests you can see what sorts of operations will have good performance and which ones will not perform so well. Check out the <a href="http://www.iozone.org/docs/IOzone_msword_98.pdf">iozone documentation here</a>.</p>
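<p>To illustrate the idea of testing a matrix of file sizes and record sizes, here is a toy Python sketch. It is not iozone and does not reproduce its methodology; it only times sequential writes for a few hypothetical file and record sizes, and the function names are invented for this example. With files this small the numbers mostly reflect caching, so treat them as an illustration of the approach and not as a benchmark.</p>
<pre>
```python
import os
import tempfile
import time

KB = 1024


def time_write(path, file_size, record_size):
    """Write about file_size bytes in record_size chunks; return bytes/second."""
    record = b"\0" * record_size
    count = file_size // record_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(record)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push data to the device
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very coarse clocks.
    return (count * record_size) / max(elapsed, 1e-9)


def write_matrix(file_sizes, record_sizes):
    """Time a write for every (file size, record size) combination."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "testfile")
        for fs in file_sizes:
            for rs in record_sizes:
                if rs > fs:
                    continue  # a record larger than the file makes no sense
                results[(fs, rs)] = time_write(path, fs, rs)
    return results


matrix = write_matrix([64 * KB, 256 * KB], [4 * KB, 64 * KB])
for (fs, rs), rate in sorted(matrix.items()):
    print(f"{fs // KB} KB file, {rs // KB} KB records: {rate / (KB * KB):.1f} MB/s")
```
</pre>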
<p>One really great thing about iozone is that the output it generates can be easily placed in a spreadsheet program like Excel to generate a great 3D diagram describing your storage. Here&#8217;s a diagram I generated from some tests on a Linux server.</p>
<div id="attachment_418" class="wp-caption alignnone" style="width: 460px"><img class="size-full wp-image-418" title="Results of a write test with iozone" src="http://controlgroupblog.files.wordpress.com/2009/07/iozone_write.png" alt="Results of a write test with iozone" width="450" height="296" /><p class="wp-caption-text">Results of a write test with iozone</p></div>
<p>This particular server performed quite well with large files and a record size around 1 MB. Interestingly, this is the same storage from the <a href="http://blog.controlgroup.com/2009/06/08/testing-storage-performance-for-video-with-lmdd/">lmdd post</a>, and the parameters I tested with there match the best write performance this disk can deliver according to iozone!</p>
<p>If you&#8217;ve been following my posts on storage performance testing I hope you&#8217;ve learned about some new tools that you can use to see what&#8217;s going on. I use these on every deployment to make sure we&#8217;re giving our clients solutions that they can depend on for performance and reliability. As always, let me know if you have any questions about these tools. Happy testing!</p>

]]></content:encoded>
			<wfw:commentRss>http://blog.controlgroup.com/2009/08/03/testing-storage-performance-with-iozone/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Testing Storage Performance with bonnie++</title>
		<link>http://blog.controlgroup.com/2009/07/21/testing-storage-performance-with-bonnie/</link>
		<comments>http://blog.controlgroup.com/2009/07/21/testing-storage-performance-with-bonnie/#comments</comments>
		<pubDate>Tue, 21 Jul 2009 15:07:35 +0000</pubDate>
		<dc:creator>David Rocamora</dc:creator>
				<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[engineering]]></category>
		<category><![CDATA[enterprise]]></category>
		<category><![CDATA[integration]]></category>
		<category><![CDATA[SAN]]></category>
		<category><![CDATA[server]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://blog.controlgroup.com/?p=333</guid>
		<description><![CDATA[Last time I posted about checking disk performance with lmdd. lmdd is great for checking streaming throughput, but what if you have a different kind of application? Every application accesses storage in different ways: with video we need to be able to provide constant throughput when writing a lot of data to the disk, but [...]]]></description>
			<content:encoded><![CDATA[<p>Last time I posted about <a href="http://blog.controlgroup.com/2009/06/08/testing-storage-performance-for-video-with-lmdd/">checking disk performance with lmdd</a>. lmdd is great for checking streaming throughput, but what if you have a different kind of application? Every application accesses storage in different ways: with video we need to be able to provide constant throughput when writing a lot of data to the disk, but other applications may have different storage needs. For example, a database can make lots of very small changes to the data on disk in a short period of time. The best performing disk for a database will probably need to have very low seek time and good transactional performance.</p>
<p><a href="http://www.coker.com.au/bonnie++/">bonnie++</a> is a series of file system tests that focuses on small files. It was designed to behave like a mail server does, creating and dealing with lots of small files (emails). bonnie++ is easy to run and outputs a CSV file that you can view with something like Excel. With the bon_csv2html command you can quickly generate HTML pages from the CSVs.</p>
<p>Here&#8217;s the output from bonnie++ running on a server:</p>
<div id="attachment_409" class="wp-caption alignnone" style="width: 460px"><img class="size-full wp-image-409" title="bonnie++ Output" src="http://controlgroupblog.files.wordpress.com/2009/07/bonnie_xx_output.png" alt="The HTML output of bonnie++ on a Linux Server" width="450" height="145" /><p class="wp-caption-text">The HTML output of bonnie++ on a Linux Server</p></div>
<p>At first glance the output can seem quite cryptic, but if we look closely we can see that it gives us a great deal of information about latency and speed on different filesystem operations. I generally run this several times as I make changes to verify that the storage is providing the right performance characteristics. Tweaking a file system to make file system operations happen a few milliseconds faster may seem ridiculous, but in some environments it can make a huge difference.</p>
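<p>To give a feel for the kind of small-file operations being timed, here is a toy Python sketch that creates, stats, and deletes a batch of small files and reports how long each phase takes. It is not bonnie++ and does not use its methodology or output format; the function name, file count, and file size are hypothetical values chosen for this example.</p>
<pre>
```python
import os
import tempfile
import time


def small_file_phases(count, size):
    """Create, stat, and delete `count` files of `size` bytes.

    Returns a dict mapping each phase name to the seconds it took.
    """
    payload = b"x" * size
    timings = {}
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"msg{i:05d}") for i in range(count)]

        start = time.perf_counter()
        for p in paths:
            with open(p, "wb") as f:
                f.write(payload)
        timings["create"] = time.perf_counter() - start

        start = time.perf_counter()
        for p in paths:
            os.stat(p)
        timings["stat"] = time.perf_counter() - start

        start = time.perf_counter()
        for p in paths:
            os.remove(p)
        timings["delete"] = time.perf_counter() - start
    return timings


timings = small_file_phases(count=200, size=2048)
for phase, seconds in timings.items():
    print(f"{phase}: {seconds * 1000:.2f} ms for 200 files")
```
</pre>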
<p>Next time I&#8217;ll post about a tool that&#8217;s new to me but can test a disk in so many different ways I&#8217;m planning to run it on every system we install from now on.</p>

]]></content:encoded>
			<wfw:commentRss>http://blog.controlgroup.com/2009/07/21/testing-storage-performance-with-bonnie/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

