Imported Content

The Story So Far Brent Ozar (blog | twitter) asked me to pick a favorite blog post for the year. Since I couldn’t pick anything I wrote (yes, I love myself that much), I had to pick one from the community. Since just about everyone on Brent’s crazy list of crazy blogs writes about SQL, I had to pick someone from the SQL Server community. My Favorite Blog Post This Year Earlier this year, Mladen Prajdić posted SQL Server – Undelete a Table and Restore a Single Table from Backup.

Inside of us is a playground – It’s a video, but there are words in it so it kind of counts as reading. MySQL vs. PostgreSQL, Part 1: Table Organization – Robert Haas gets paid to write PostgreSQL. He’s started up a series of posts looking at the differences between PostgreSQL and MySQL. It’s interesting to look at how different databases implement various features. On this one, SQL Server does things the same way as MySQL.

This is a follow-up to the last two posts I made about querying HBase and Hive. The setup for this was a bit trickier than I would have liked, so I’m documenting my entire process for three reasons: to remind myself for the next time I have to do this, to help someone else get started, and in the hope that someone will know a better way and help me improve this.

I recently talked about using the Toad for Cloud Databases Eclipse plug-in to query an HBase database. After I finished up the video, I did some work loading a sample dataset from Retrosheet into my local Hive instance. This 7-minute tutorial shows you brand new functionality in the Toad for Cloud Databases Eclipse plug-in and how you can use it to perform data warehousing queries against Hive. https://www.youtube.com/watch?v=iWzvBWd9K-M

It’s my birthday today, so I’m going to subject you to more things that I’ve been reading recently. William Gibson says the future is right here, right now – I couldn’t agree with this more. Time gets faster because we’re constantly connected to the world. The Humble Pencil, The Mighty Computer – I love using pencil and paper to work through things. In fact, I don’t understand people who can draw, sketch, or brainstorm straight into a piece of software.

When I first started blogging, I was nervous because I might be wrong. Not just “wrong” like “Gee Ted, I don’t know if I agree with your opinion about using Irish children as forced labor,” but factually wrong. And if I was wrong on the internet, people would see it forever and ever. Well, Google would know about it forever and ever. A friend of mine advised me that I would certainly be wrong on the internet, just like I’ve been fantastically wrong in real life, and that the only thing you can do is get up, admit you were wrong, and keep going again.

Update: I want to thank Ben Black, Todd Lipcon, and Kelley Reynolds for pointing out the inaccuracies in the original post. I’ve gone through the references they provided and made corrections. Facebook, the data giant of the internet, recently unveiled a new messaging system. To be fair, I would normally ignore feature rollouts and marketing flimflam, but Facebook’s announcement is worthy of attention not only because of the underlying database but also because of how the move was done.

Understanding checkpoint_completion_target Ever wonder what’s REALLY going on under the hood of my favorite database? I do. Hubert Depesz is writing a series of articles about PostgreSQL configuration parameters. This one goes into incredible detail about how PostgreSQL handles checkpointing and keeps the database performing at high speeds. Riak SmartMachine Benchmark: The Technical Details How much data can you push for not a lot of money? It turns out that the answer is “A lot of data.

We all know that you can use NoSQL databases to store data. And that’s cool, right? After all, NoSQL databases can be massively distributed, redundant, and really, really fast. But some of the things that make NoSQL databases really interesting aren’t just the redundancy, performance, or their ability to use all of those old servers in the closet. Under the covers, NoSQL databases are supported by complex code that makes these features possible – things like distributed file systems.

MongoDB has replication built in. So do SQL Server, Oracle, DB2, PostgreSQL, and MySQL. What’s the difference? What makes each MongoDB a unique and special snowflake? I recently read a three-part series on MongoDB replication (Replication Internals, Getting to Know Your Oplog, Bending the Oplog to Your Will) in an effort to better understand MongoDB’s replication compared to SQL Server’s replication. Logging Sidebar Before we get started, it’s important to distinguish between the oplog and MongoDB’s regular log.


Twelve Days of SQL – Day (2 – 1)

What I’m Reading – 2010-12-03

Getting Started With Hive

Querying Hive with Toad for Cloud Databases

What I’m Reading – 2010 Birthday Edition

On Being Wrong

Facebook Messaging - HBase Comes of Age

What I'm Reading 2010-11-15

New Uses for NoSQL

Comparing MongoDB and SQL Server Replication