AndeeKaplan

Assistant Professor at Colorado State University Statistics

Connecting R to Data Sources (ODBC)

Toggle Code Viewing: ON

Import data directly from data sources such as Oracle, MySQL, Microsoft SQL Server, Microsoft Access, Microsoft Excel, and more.

ODBC Diagram

In this example, I will connect R to an Oracle database. Some of the steps below will only need to be completed once. Others, as indicated, must be performed each time you open R.

Step 1: Configuring the ODBC Driver on your computer.

1st Time Only: From the Start Menu on your computer:
  1. Control Panel
  2. Administrative Tools
  3. Data Sources (ODBC)
  4. Select "Add"
  5. Scroll Down to Driver Name (in my example OraClient10g_home1)
  6. Data Source Name: Choose name
  7. Specify Data Source Configuration
    • (Specific to Oracle) TNS Service Names: Select from dropdown
    • (Specific to Oracle) User ID - login hint: if you connect to your database through Toad, the login will be identical

Step 2: Creating the connection between R and ODBC Driver.

1st Time Only: In the R Console:
install.packages("RODBC")
Each Time You Open R: In the R Console:
  1. Load the RODBC package; type:
    library(RODBC)
    
  2. Create the connection; type:
    con <- odbcDriverConnect()
    
    (opens a Select Data Source window with two tabs)
  3. Select "Machine Data Source" tab
  4. Click on desired Data Source Name and press enter
  5. Log in with User ID and Password

Step 3: Working with Data in R

Example Oracle, Part 1: SQL Query to Import data set into R

In the R Console:

  1. Query the data from the database table
    data <- sqlQuery(con, "select * from TABLE_NAME where COLUMN_NAME = 'Awesome';")
    

Note that within the queries, you must always use single quotes.

Example Oracle, Part 2: SQL Query to create a table in your database

In the R Console:

  1. Select/create your data frame (table) that you want to upload as a table in the database. For help on data frames and creating tables in your database:
    help(data.frame)
    help(sqlSave)
    
  2. In this example, suppose the data frame is named "budget".
    sqlSave(con, budget, tablename = "BUDGET_2010", colnames = FALSE, rownames = FALSE)
    

Now you have created a table in your database entitled "BUDGET_2010". Set the column/row names to TRUE or FALSE depending on your data frame.

Example Excel: Import an Excel file worksheet into R using ODBC

In the R Console:

  1. Type:
    conexcel <- odbcConnectExcel2007(choose.files())
    
    (opens up Select File window)
  2. Click on desired Excel File and press enter
  3. Now you can fetch data from a specific worksheet within the file, such as the tab "2010 Budget":
    budget <- sqlFetch(conexcel, "2010 Budget")
    
  4. Additionally, suppose you want to select only rows 1 - 71 and columns 1-5 from budget
    budget <- budget[1:71, 1:5]
    
  5. Finally, you can close the ODBC connection if desired. Alternatively, it will close when you exit R.
    odbcClose(conexcel)
    
Example Excel, Alternate: Read Excel spreadsheet into R
  1. Create a directory in your hard drive that does not have spaces in the file path, such as
    "C:\mydata"
  2. Save Excel spreadsheet in ".csv" or other appropriate format in your newly created directory
  3. For help on reading files into R, type in the R Console
    help(read.table)
    help(read.csv)
    
  4. Read file into R, replacing back slashes with forward slashes. Additionally, you can specify rows (1-65) and columns (2-7).
    budget2 <- data.frame(read.csv("C:/mydata/2010_Budget_Example.csv", header = T, 
        dec = ".", sep = ",")[1:65, 2:7])
    



    blog comments powered by Disqus