본문 바로가기

Python

Python_EX-005. 10 way data read from excel file

No. Method Code Example Pros Cons
1 pandas  # to read an Excel file
pimport pandas as pd

df = pd.read_excel('filename.xlsx')
print(df)

- Simple & efficient - Requires
pandas library
2 openpyxl # to read an Excel file
from openpyxl import load_workbook

wb = load_workbook('filename.xlsx')
sheet = wb.active
for row in sheet.iter_rows(values_only=True):
    print(row)

- Handles .xlsx files well - Slightly more complex
3 xlrd  # to read an Excel file
import xlrd

workbook = xlrd.open_workbook('filename.xlsx')
sheet = workbook.sheet_by_index(0)
for row in range(sheet.nrows):
    print(sheet.row_values(row))

- Good for reading  
older .xls files
- xlrd no longer
  supports .xlsx files
4 pyxlsb  # to read a binary Excel file
import pandas as pd
from pyxlsb import open_workbook

with open_workbook('filename.xlsb') as wb:
    with wb.get_sheet(1) as sheet:
        for row in sheet.rows():
            print([item.v for item in row])

- Efficient for binary files - Requires pyxlsb library
5 pandas # pandas with xlrd engine
import pandas as pd

df = pd.read_excel('filename.xls', 
            engine='xlrd')
print(df)

- Compatibility with older Excel formats - Requires xlrd library
6 pandas # pandas with openpyxl
import pandas as pd

df = pd.read_excel('filename.xlsx', 
            engine='openpyxl')
print(df)
- Handles .xlsx files well - Requires
openpyxl library
7 pandas # pandas with specific sheet
import pandas as pd

df = pd.read_excel('filename.xlsx', 
            sheet_name='Sheet1')
print(df)
- Ability to read
specific sheets
- None
8 pandas # Pandas with multiplesheets
import pandas as pd

sheets = pd.read_excel('filename.xlsx', 
                    sheet_name=None)
for name, sheet in sheets.items():
    print(f'Sheet name: {name}')
    print(sheet)
- Reads all sheets
in one go
- Can be memory-intensive with large workbooks
9 openpyxl  #to read specific cells
from openpyxl import load_workbook

wb = load_workbook('filename.xlsx')
sheet = wb['Sheet1']
cell = sheet['A1']
print(cell.value)
- Fine-grained control 
over cell access
- Requires
openpyxl library
10 pandas  # to read specific columns
import pandas as pd

df = pd.read_excel('filename.xlsx', 
            usecols=['Column1', 'Column2'])
print(df)
- Reads only 
required columns
- Requires pandas library