Importing Large Files In To Excel
This page describes VBA code to import large text files into Excel.
Excel 2003 and earlier versions are limited to 65,536 rows of data. You cannot
increase this limit. Therefore, if you attempt to import a very large text file,
only the first 64K rows will be imported (or fewer if the imported data doesn't
start at row 1). Excel's built in import and open functions will only import as
much data as will fit on the active worksheet. It will not continue the import
operation on subsequent worksheets. This page describes a procedure named
ImportBigFile that will import a text file with any
number of records. The procedure will create additional worksheets as required.
It will optionally split each line of input data into separate columns,
delimited by any specified character. The code isn't only for files with more
than 64K rows -- it can be used to import a text file of any size. The code
requires Excel 2000 or later.
|
The input procedures used in the ImportBigTextFile procedure
recognize only vbCr and vbCrLf characters
(ASCII 13) as line breaks. If your text file uses vbLf (ASCII 10) characters
as line breaks, the import operation will not work properly. You will need to convert the
vbLf characters to vbCr characters. The
ConvertFileLF function at the end of this page will convert a file
from vbLf line breaks to vbCr characters
and save the modifications to a new file.
|
You are not restricted to filling an entire worksheet before moving on to a
subsequent worksheet. You can set the MaxRowsPerSheet
variable to the number of records that should appear on each sheet. This will,
for example, allow only 2,000 records per sheet. Alternatively, you can set the
LastRowForImport variable to the last row to which
data will be written on each sheet. Typically, you will use only one of these
two value, setting the other one to 0. The difference between the two values are
that MaxRowsPerSheet indicates the total number of
records per sheet, regardless of row numbers, while
LastRowForImport indicates the last row number to be used on each sheet.
The data import begins on the row specified by
C_START_ROW_FIRST_PAGE on the first worksheet and begin in the row
specified by C_START_ROW_LATER_PAGES on the second and
subsequent worksheets. These constants must be between 1 and 65536.
The data is always imported in to the active workbook, which may not be the same
workbook that contains the code.
DEFINITION: The ActiveWorkbook is the workbook that is presently
displayed in the main Excel window or the window that has focus if
there are multiple windows open. ThisWorkbook always refers to the
workbook that contains the code, regardless of what workbook may be active.
The function version returns as its result the total number of lines imported,
or -1 if an error occurred. If an error occurred, the error number is placed in
the ErrorNumber parameter variable and a text
description of the error is placed in the ErrorMessage
parameter. The symbolic constants for error numbers are listed at the top of the
code module.
If a new worksheet needs to be created to store the imported data, you may
specify a worksheet to be used as a template by setting the
C_TEMPLATE_SHEET_NAME value to the name of the
template sheet. This template sheet must exist in the active workbook. You can
specify how the created sheets should be named by setting the
C_SHEET_NAME_PREFIX value to the text prefix to be used as the sheet
name. A number, starting with 1 and incrementing for each created worksheet,
will be concatenated with C_SHEET_NAME_PREFIX to
create the sheet name.
The only restrictions on the number of records that may be imported are the
amount of memory available to Excel on your machine and the upper limit of a
Long Integer data type (2,147,483,647). If you are concerned about overflowing
the long, don't fret. That would be about 32,000 worksheets and Excel will die
long before that.
During the import process, a new sheet will be created when any one or
more of the following conditions are true:
- The current import row is greater than Rows.Count (65,536).
- The current import row is greater than LastRowForInput
- The number of imported rows on the sheet is greater than the value of
MaxRowsPerSheet
Complete documentation is provided in the in the
downloadable bas module file and the
downloadable function module.
Option Explicit
Option Compare Text
Sub ImportBigTextFile()
'''''''''''''''''''
Const C_START_ROW_FIRST_PAGE = 3
Const C_START_ROW_LATER_PAGES = 2
Const C_START_SHEET_NAME = "Sheet1"
Const C_START_COLUMN = 4
Const C_SHEET_NAME_PREFIX = "DataImport"
Const C_TEMPLATE_SHEET_NAME = vbNullString
Const C_UPDATE_STATUSBAR_EVERY_N_RECORDS = 1000
Const C_STATUSBAR_TEXT = "Processing Record: "
Dim RowNdx As Long Dim Colndx As Long Dim FName As Variant Dim FNum As Integer Dim WS As Worksheet Dim InputLine As String Dim Arr As Variant Dim SplitChar As String Dim SheetNumber As Long Dim SaveCalc As XlCalculation Dim SaveScreenUpdating As Boolean Dim SaveDisplayAlerts As Boolean Dim SaveEnableEvents As Boolean Dim InputCounter As Long Dim LastRowForInput As Long Dim MaxRowsPerSheet As Long Dim RowsThisSheet As Long Dim TruncatedCount As Long
SheetNumber = 1
If Application.ActiveWorkbook Is Nothing Then
MsgBox "There is no active workbook."
Exit Sub
End If
'SplitChar = ","
MaxRowsPerSheet = 0&
LastRowForInput = ActiveWorkbook.Worksheets(1).Rows.Count
If (Len(C_SHEET_NAME_PREFIX) < 1) Or (Len(C_SHEET_NAME_PREFIX) > 29) Then
MsgBox "The value of C_SHEET_NAME_PREFIX must have between 1 and 29 characters." & vbCrLf & _
"The current length of C_SHEET_NAME_PREFIX is " & CStr(Len(C_SHEET_NAME_PREFIX)) & " characters."
Exit Sub
End If
On Error Resume Next
Err.Clear
Set WS = ActiveWorkbook.Worksheets(C_START_SHEET_NAME)
If Err.Number <> 0 Then
MsgBox "The sheet named in C_START_SHEET_NAME (" & C_START_SHEET_NAME & ") does not exist" & vbCrLf & _
"or is not a worksheet (e.g., it is a chart sheet).", vbOKOnly
Exit Sub
End If
On Error Resume Next
Err.Clear
If C_TEMPLATE_SHEET_NAME <> vbNullString Then
Set WS = ActiveWorkbook.Worksheets(C_TEMPLATE_SHEET_NAME)
If Err.Number <> 0 Then
MsgBox "The template sheet '" & C_TEMPLATE_SHEET_NAME & "' does not exist or is not a worksheet."
Exit Sub
End If
If C_TEMPLATE_SHEET_NAME = C_START_SHEET_NAME Then
MsgBox "The C_TEMPLATE_SHEET_NAME is equal to the C_START_SHEET_NAME." & vbCrLf & _
"This is not allowed."
Exit Sub
End If
End If
On Error GoTo 0
Set WS = ActiveWorkbook.Worksheets(C_START_SHEET_NAME)
If WS.ProtectContents = True Then
MsgBox "The worksheet '" & WS.Name & "' is protected."
Exit Sub
End If
If C_TEMPLATE_SHEET_NAME <> vbNullString Then
If ActiveWorkbook.Worksheets(C_TEMPLATE_SHEET_NAME).ProtectContents = True Then
MsgBox "The Template Sheet (" & C_TEMPLATE_SHEET_NAME & ") is protected."
Exit Sub
End If
End If
If ActiveWorkbook.ProtectStructure = True Then
MsgBox "The ActiveWorkbook is protected."
Exit Sub
End If
FName = Application.GetOpenFilename(FileFilter:="Text Files (*.txt),*.txt," & _
"CSV Files (*.csv),*.csv")
If FName = False Then
' user clicked CANCEL. get out now.
Exit Sub
End If
On Error Resume Next
Set WS = ActiveWorkbook.Worksheets(C_START_SHEET_NAME)
If WS Is Nothing Then
MsgBox "The worksheet specified in C_START_SHEET_NAME (" & _
C_START_SHEET_NAME & ") does not exist."
Exit Sub
End If
On Error GoTo 0
If IsFileOpen(FileName:=CVar(FName)) = True Then
MsgBox "The file '" & FName & "' is open by another process."
Exit Sub
End If
SaveCalc = Application.Calculation
SaveDisplayAlerts = Application.DisplayAlerts
SaveScreenUpdating = Application.ScreenUpdating
SaveEnableEvents = Application.EnableEvents
Application.DisplayAlerts = False
Application.Calculation = xlCalculationManual
Application.ScreenUpdating = False
Application.EnableEvents = False
On Error Resume Next
FNum = FreeFile
Err.Clear
Open FName For Input Access Read As #FNum
If Err.Number <> 0 Then
MsgBox "An error occurred opening file '" & FName & "'." & vbCrLf & _
"Error Number: " & CStr(Err.Number) & vbCrLf & _
"Description: " & Err.Description
Close #FNum
Application.Calculation = SaveCalc
Application.ScreenUpdating = SaveScreenUpdating
Application.DisplayAlerts = SaveDisplayAlerts
Application.EnableEvents = SaveEnableEvents
Exit Sub
End If
On Error GoTo 0
RowNdx = C_START_ROW_FIRST_PAGE
If SplitChar <> vbNullString Then
SplitChar = Left(SplitChar, 1)
End If
If LastRowForInput <= 0 Then
LastRowForInput = WS.Rows.Count
End If
If MaxRowsPerSheet <= 0 Then
MaxRowsPerSheet = Rows.Count
End If
On Error GoTo 0
Do Until EOF(FNum)
Line Input #FNum, InputLine
InputCounter = InputCounter + 1
RowsThisSheet = RowsThisSheet + 1
If C_UPDATE_STATUSBAR_EVERY_N_RECORDS > 0 Then
If InputCounter Mod C_UPDATE_STATUSBAR_EVERY_N_RECORDS = 0 Then
Application.StatusBar = C_STATUSBAR_TEXT & _
Format(InputCounter, "#,##0")
End If
End If
If SplitChar = vbNullString Then
WS.Cells(RowNdx, C_START_COLUMN).Value = InputLine
Else
Arr = Split(expression:=InputLine, delimiter:=SplitChar, limit:=-1, compare:=vbTextCompare)
For Colndx = LBound(Arr) To UBound(Arr)
If Colndx + C_START_COLUMN <= WS.Columns.Count Then
WS.Cells(RowNdx, Colndx + C_START_COLUMN).Value = Arr(Colndx)
Else
TruncatedCount = TruncatedCount + 1
Exit For
End If
Next Colndx
End If ' SplitChar = vbNullString
RowNdx = RowNdx + 1
If (RowNdx > Rows.Count) Or (RowNdx > LastRowForInput) Or (RowsThisSheet > MaxRowsPerSheet) Then
SheetNumber = SheetNumber + 1
If C_TEMPLATE_SHEET_NAME = vbNullString Then
Set WS = ActiveWorkbook.Worksheets.Add(after:=WS)
Else
ActiveWorkbook.Worksheets(C_TEMPLATE_SHEET_NAME).Copy after:=WS
Set WS = ActiveWorkbook.ActiveSheet
End If
On Error Resume Next
WS.Name = C_SHEET_NAME_PREFIX & Format(SheetNumber, "0")
On Error GoTo 0
RowNdx = C_START_ROW_LATER_PAGES
RowsThisSheet = 0
End If
Loop
Close FNum
Application.Calculation = SaveCalc
Application.ScreenUpdating = SaveScreenUpdating
Application.DisplayAlerts = SaveDisplayAlerts
Application.EnableEvents = SaveEnableEvents
Application.StatusBar = False
MsgBox "Import operation from file '" & FName & "' complete." & vbCrLf & _
"Records Imported: " & Format(InputCounter, "#,##0") & vbCrLf & _
"Records Truncated: " & Format(TruncatedCount, "#,##0"), _
vbOKOnly, "Import Text File"
End Sub
Private Function IsFileOpen(FileName As String) As Boolean
Dim FileNum As Integer
Dim ErrNum As Long
Const C_ERR_NO_ERROR = 0&
Const C_ERR_PERMISSION_DENIED = 70&
On Error Resume Next
If FileName = vbNullString Then
IsFileOpen = False
Exit Function
End If
On Error Resume Next
If Dir(FileName, vbNormal + vbArchive + vbSystem + vbHidden) = vbNullString Then
IsFileOpen = False
Exit Function
End If
FileNum = FreeFile() ' Get a free file number.
Err.Clear
Open FileName For Input Lock Read As #FileNum
ErrNum = Err.Number
Close FileNum
Select Case ErrNum
Case C_ERR_NO_ERROR
IsFileOpen = False
Case C_ERR_PERMISSION_DENIED
IsFileOpen = True
Case Else
IsFileOpen = True
End Select
End Function
The input and output procedures in the ImportBigTextFile procedure require that lines be
separated by vbCr or vbCrLf (ASCII 13) characters. If your file uses
vbLf characters (ASCII 10), as is often the case with file created on non-Windows platforms,
you will need to convert the vbLf characters to vbCr characters. The
code below will convert the line break characters.
Function ConvertFileLF(InFileName As String, OutFileName As String) As Boolean
Dim FNum As Integer
Dim S As String
Dim L As Long
If Dir(InFileName) = vbNullString Then
ConvertFileLF = False
Exit Function
End If
L = FileLen(InFileName)
FNum = FreeFile
Open InFileName For Input Access Read As #FNum
S = Input(L, #FNum)
Close #FNum
S = Replace(S, vbLf, vbCr)
FNum = FreeFile
Open OutFileName For Output Access Write As #FNum
Write #FNum, S
Close #FNum
ConvertFileLF = True
End Function
This page last updated: 11-July-2007