How to read huge Excel file using C#

Reading a Excel file is easy using the SpreadsheetDocument and SheetData classes but this method uses a lot of memory if the Excel file is big.
This is due to the fact that to get the sheet data the framework loads the entire file into memory.

To read huge files it is better to use the OpenXmlReader.

Here’s an example usage of the OpenXmlReader (it also consider the SharedString)

public static IEnumerable> ReadData(string fileName)
{
    using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false)) {
        WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
        WorksheetPart worksheetPart = workbookPart.WorksheetParts.FirstOrDefault();
        if (workbookPart != null) {
            using (OpenXmlReader oxr = OpenXmlReader.Create(worksheetPart)) {
                IEnumerable sharedStrings = workbookPart.SharedStringTablePart.SharedStringTable.Elements();
                while (oxr.Read()) {
                    if (oxr.ElementType == typeof(Row)) {
                        oxr.ReadFirstChild();

                        List rowData = new List();
                        do {
                            if (oxr.ElementType == typeof(Cell)) {
                                Cell c = (Cell)oxr.LoadCurrentElement();
                                string cellValue;
                                if (c.DataType != null && c.DataType == CellValues.SharedString) {
                                    SharedStringItem ssi = sharedStrings.ElementAt(int.Parse(c.CellValue.InnerText));
                                    cellValue = ssi.Text.Text;
                                } else {
                                    cellValue = c.CellValue.InnerText;
                                }

                                rowData.Add(cellValue);
                            }
                        } while (oxr.ReadNextSibling());

                        yield return rowData;
                    }
                }
            }
        }
    }
} // GetData

To loop the Excel’s rows

foreach (List rowData in ReadData(fileName)) {
    ...
}

The OpenXmlReader uses less memory, but it is slow.
For faster reading you can use the ExcelDataReader, it is available also as a NuGet package.