Importing Open Data Sets in SQL Server

Whilst working on an upcoming post I realised that I needed a decent data set on which to test some of the new visualisations in SSRS 2016. I remembered this post on open data sets by Kendra Little that mentioned Data.gov. There are a lot of different data sets here but I settled on a set describing social media usage by organisations in New York.

Open Dataset NYC Social Media Use
Make sure you read the license information.

After downloading my data as a .CSV file I created a database to store it in.

CREATE DATABASE [OpenDatasets];

After that right click the database and select Tasks and then Import Data…

Importing the Data

Click next past the welcome screen and then select flat file source from the Data source drop down. Browse to your file and check the columns have been discovered correctly. If your data uses a different row or column delimiter then you can alter those on the columns section. A text qualifier can be set on the general tab and data types can be set per column for each column on the Advanced tab.

Setting a text qualifier when importing data into SQL Server

Alter OutputColumnWidth to allow for text fields larger than 50

Here we’ll set Date Sampled to a Date and the last column to an INT. We’ll also increase Url to 250 as I know (from looking through the file) some of those values will be larger than the default 50. Check it looks good in the preview and click Next.

On the Destination Tab scroll to the bottom of the Destination drop down and select SQL Server Native Client. If you have multiple local SQL instances or are importing to a remote instance your will need to select the instance from the Server name drop down.

Selecting SQL Server Native client when importing data

Click Next and use Edit Mappings if inserting into an existing table.

Leave Run Immediately ticked and click Finish (twice)

Data has been imported in SQL Server from a .CSV file

If you get all Successes check your data in SSMS. You may want to start adding indexes at this point but this is out of scope for this post. So to finish I’m going to run the following query against my new data to gain some insight into it.

SELECT Agency
,[Platform]
,SUM([Likes Followers Visits Downloads]) AS Likes

FROM [OpenDatasets].[dbo].[NYC_Social_Media_Usage]

GROUP BY Agency, [Platform]

ORDER BY [Likes] DESC

Ignoring the Total values at the top (I will remove this row from my data later) it’s obvious to see that NYC Gov’s Newsletter is the most ‘Liked’ item.

Query results from an open data set query

The Import Export Wizard is a fairly versatile tool for getting data into SQL Server but, with a larger data set, it may make more sense to use the Bulk Copy Program (BCP) utility to import the data.

This was a very simple process to get a real world data set into SQL Server. It actually took me longer to select a data set from Data.gov than it did to download and import it.

About the author

James Anderson

I'm a freelance SQL Server DBA in Southampton, UK. Performance tuning code and improving development processes are what I love to do for work. Outside of work, I live to travel to as many places as possible with my wife Sarah.

Leave a Reply

Your email address will not be published. Required fields are marked *

Sign Up

James Anderson SQL Server DBA

Hi, I'm James Anderson. When I'm not blogging about SQL Server, I do this.

The SQL People

DBA Stackexchange Profile