Whilst working on an upcoming post I realised that I needed a decent data set on which to test some of the new visualisations in SSRS 2016. I remembered this post on open data sets by Kendra Little that mentioned Data.gov. There are a lot of different data sets here but I settled on a set describing social media usage by organisations in New York.
After downloading my data as a .CSV file I created a database to store it in.
CREATE DATABASE [OpenDatasets];
After that right click the database and select Tasks and then Import Data…
Importing the Data
Click next past the welcome screen and then select flat file source from the Data source drop down. Browse to your file and check the columns have been discovered correctly. If your data uses a different row or column delimiter then you can alter those on the columns section. A text qualifier can be set on the general tab and data types can be set per column for each column on the Advanced tab.
Here we’ll set Date Sampled to a Date and the last column to an INT. We’ll also increase Url to 250 as I know (from looking through the file) some of those values will be larger than the default 50. Check it looks good in the preview and click Next.
On the Destination Tab scroll to the bottom of the Destination drop down and select SQL Server Native Client. If you have multiple local SQL instances or are importing to a remote instance your will need to select the instance from the Server name drop down.
Click Next and use Edit Mappings if inserting into an existing table.
Leave Run Immediately ticked and click Finish (twice)
If you get all Successes check your data in SSMS. You may want to start adding indexes at this point but this is out of scope for this post. So to finish I’m going to run the following query against my new data to gain some insight into it.
SELECT Agency ,[Platform] ,SUM([Likes Followers Visits Downloads]) AS Likes FROM [OpenDatasets].[dbo].[NYC_Social_Media_Usage] GROUP BY Agency, [Platform] ORDER BY [Likes] DESC
Ignoring the Total values at the top (I will remove this row from my data later) it’s obvious to see that NYC Gov’s Newsletter is the most ‘Liked’ item.
The Import Export Wizard is a fairly versatile tool for getting data into SQL Server but, with a larger data set, it may make more sense to use the Bulk Copy Program (BCP) utility to import the data.
This was a very simple process to get a real world data set into SQL Server. It actually took me longer to select a data set from Data.gov than it did to download and import it.