Connecting R to Data Sources (ODBC)
Toggle Code Viewing:
ON
Import data directly from data sources such as Oracle, MySQL, Microsoft SQL Server, Microsoft Access, Microsoft Excel, and more.
In this example, I will connect R to an Oracle database. Some of the steps below will only need to be completed once. Others, as indicated, must be performed each time you open R.
Step 1: Configuring the ODBC Driver on your computer.
1st Time Only: From the Start Menu on your computer:- Control Panel
- Administrative Tools
- Data Sources (ODBC)
- Select "Add"
- Scroll Down to Driver Name (in my example OraClient10g_home1)
- Data Source Name: Choose name
- Specify Data Source Configuration
- (Specific to Oracle) TNS Service Names: Select from dropdown
- (Specific to Oracle) User ID - login hint: if you connect to your database through Toad, the login will be identical
Step 2: Creating the connection between R and ODBC Driver.
1st Time Only: In the R Console:install.packages("RODBC")
- Load the RODBC package; type:
library(RODBC)
- Create the connection; type:
(opens a Select Data Source window with two tabs)
con <- odbcDriverConnect()
- Select "Machine Data Source" tab
- Click on desired Data Source Name and press enter
- Log in with User ID and Password
Step 3: Working with Data in R
Example Oracle, Part 1: SQL Query to Import data set into RIn the R Console:
- Query the data from the database table
data <- sqlQuery(con, "select * from TABLE_NAME where COLUMN_NAME = 'Awesome';")
Note that within the queries, you must always use single quotes.
Example Oracle, Part 2: SQL Query to create a table in your databaseIn the R Console:
- Select/create your data frame (table) that you want to upload as a table in the database. For help on data frames and creating tables in your database:
help(data.frame) help(sqlSave)
- In this example, suppose the data frame is named "budget".
sqlSave(con, budget, tablename = "BUDGET_2010", colnames = FALSE, rownames = FALSE)
Now you have created a table in your database entitled "BUDGET_2010". Set the column/row names to TRUE or FALSE depending on your data frame.
Example Excel: Import an Excel file worksheet into R using ODBCIn the R Console:
- Type:
(opens up Select File window)
conexcel <- odbcConnectExcel2007(choose.files())
- Click on desired Excel File and press enter
- Now you can fetch data from a specific worksheet within the file, such as the tab "2010 Budget":
budget <- sqlFetch(conexcel, "2010 Budget")
- Additionally, suppose you want to select only rows 1 - 71 and columns 1-5 from budget
budget <- budget[1:71, 1:5]
- Finally, you can close the ODBC connection if desired. Alternatively, it will close when you exit R.
odbcClose(conexcel)
- Create a directory in your hard drive that does not have spaces in the file path, such as
"C:\mydata" - Save Excel spreadsheet in ".csv" or other appropriate format in your newly created directory
- For help on reading files into R, type in the R Console
help(read.table) help(read.csv)
- Read file into R, replacing back slashes with forward slashes. Additionally, you can specify rows (1-65) and columns (2-7).
budget2 <- data.frame(read.csv("C:/mydata/2010_Budget_Example.csv", header = T, dec = ".", sep = ",")[1:65, 2:7])
blog comments powered by Disqus