xlsx2html¶
A simple export from xlsx format to html tables with keep cell formatting
Install¶
From pip
pip install xlsx2html
From github dev version
pip install -e "git+https://github.com/Apkawa/xlsx2html.git#egg=xlsx2html"
Compatibly¶
Python |
xlsx2html |
---|---|
2.7 |
0.1.10 |
3.5 |
0.2.1 |
>=3.6 |
latest |
Usage¶
Simple usage¶
from xlsx2html import xlsx2html
out_stream = xlsx2html('path/to/example.xlsx')
out_stream.seek(0)
print(out_stream.read())
pass output file
from xlsx2html import xlsx2html
xlsx2html('path/to/example.xlsx', 'path/to/output.html')
use file like objects
import io
from xlsx2html import xlsx2html
# must be binary mode
xlsx_file = open('path/to/example.xlsx', 'rb')
out_file = io.StringIO()
xlsx2html(xlsx_file, out_file, locale='en')
out_file.seek(0)
result_html = out_file.read()
from shell
python -m xlsx2html path/to/example.xlsx path/to/output.html
Advanced usage¶
Use converter class¶
from xlsx2html import XLSX2HTMLConverter
converter = XLSX2HTMLConverter(
filepath='path/to/example.xlsx',
locale='de_DE',
parse_formula=True,
inline_styles=False
)
html = converter.get_html(sheet="sheet name")
Export sheet to only table¶
from xlsx2html import XLSX2HTMLConverter
converter = XLSX2HTMLConverter(
filepath='path/to/example.xlsx',
locale='de_DE',
parse_formula=True,
inline_styles=False
)
result = converter.get_table(sheet="sheet name", extra_attrs={'id': 'table_id'})
print(f"""
<html>
<head>
<style type="text/css">
{result.css}
</style>
</head>
<body>
{result.html}
</body>
</html>""")
Export all sheets¶
from xlsx2html import XLSX2HTMLConverter
converter = XLSX2HTMLConverter(
filepath='path/to/example.xlsx',
locale='de_DE',
parse_formula=True,
inline_styles=False
)
results = converter.get_tables(extra_attrs={'class': 'xlsx_sheet'})
css_str = '\n'.join([r.css for r in results])
tables_str = '\n'.join([r.html for r in results])
print(f"""
<html>
<head>
<style type="text/css">
{css_str}
</style>
</head>
<body>
{tables_str}
</body>
</html>""")
use openpyxl.Workbook instance¶
import openpyxl
from xlsx2html import xlsx2html
XLSX_FILE = 'path/to/example.xlsx'
# Simple, but no work with parse_formula=True
out_file = xlsx2html(load_workbook(XLSX_FILE, data_only=True))
# Use converter
converter = XLSX2HTMLConverter(
filepath=load_workbook(XLSX_FILE, data_only=True),
parse_formula=True,
formula_wb=load_workbook(XLSX_FILE, data_only=False),
)
out_file = converter.get_html_stream()
Limitations¶
No support:
[ ] overline (no working with openpyxl and LibreOffice with xlsx)
[ ] conditional styles
[ ] charts
[ ] freezing panes
[ ] diagonal borders
[ ] pattern cell fill
[ ] Have issue with
border-collapse: collapse
and merged cells
API Reference¶
Core¶
- xlsx2html.core.xlsx2html(filepath, output=None, locale='en', sheet=None, parse_formula=False, default_cell_border=None, inline_styles=False)[source]¶
- Parameters
filepath (
Union
[BinaryIO
,str
,Workbook
]) – xlsx fileoutput (
Union
[TextIO
,str
,None
]) – to path or file like, defaults to Nonelocale (
str
) –en
orzh_TW
. defaults toen
sheet (
Union
[int
,str
,None
]) – sheet name or idx, defaults to None what means get active sheetparse_formula (
bool
) – If True - enable parse formulas. defaults to Falsedefault_cell_border (
Union
[str
,Dict
[str
,Optional
[str
]],None
]) – default border style. Can use short str like1px solid black
or dict like{'width': '1px', 'style': 'solid', 'color': 'black'}
inline_styles (
bool
) – store styles inline
- Return type
TextIO
- Returns
File like object
- class xlsx2html.core.XLSX2HTMLConverter(filepath, locale='en', parse_formula=False, inline_styles=False, display_grid=False, default_border_style=None, formula_wb=None, parser=None, renderer=None)[source]¶
Bases:
object
- Parameters
filepath (str | BinaryIO | openpyxl.Workbook) – xlsx file
locale (str) –
en
orzh_TW
. defaults toen
parse_formula (
bool
) – If True - enable parse formulas. defaults to Falseformula_fb – If parse_formula set to True and type filepath as openpyxl.Workbook then pass
formula_wb=openpyxl.load_workbook(filepath, data_only=False)
default_border_style (
Union
[str
,Dict
[str
,Optional
[str
]],None
]) – default border style. Can use short str like1px solid black
or dict like{'width': '1px', 'style': 'solid', 'color': 'black'}
inline_styles (
bool
) – store styles inlinedisplay_grid (
bool
) – Show column letters and row numbers. IfXLSX2HTMLConverter.default_border_style
is none - do enabled gray grid
- filepath: Union[BinaryIO, str, openpyxl.workbook.workbook.Workbook]¶
- locale: str = 'en'¶
- parse_formula: bool = False¶
- inline_styles: bool = False¶
- display_grid: bool = False¶
- default_border_style: Optional[Union[str, Dict[str, Optional[str]]]] = None¶
- wb: openpyxl.workbook.workbook.Workbook¶
- formula_wb: Optional[openpyxl.workbook.workbook.Workbook] = None¶
- parser: dataclasses.InitVar = None¶
- renderer: dataclasses.InitVar = None¶
- get_table(sheet=None, extra_attrs=None)[source]¶
- Parameters
sheet (
Union
[int
,str
,None
]) – sheet name or idx, defaults to None what means get active sheetextra_attrs (
Optional
[Dict
[str
,Optional
[str
]]]) – additional attributes for <table> like class or id
- Return type
- Returns
- get_tables(sheets=None, extra_attrs=None)[source]¶
- Parameters
sheets (
Optional
[Iterable
[Union
[int
,str
,None
]]]) – list of sheet name or idx. By defaults get all sheetsextra_attrs (
Optional
[Dict
[str
,Optional
[str
]]]) – additional attributes to <table …> like class or id
- Return type
List
[ConverterTableResult
]- Returns
- class xlsx2html.core.ConverterTableResult(html, css='')[source]¶
Bases:
object
- Parameters
html (
str
) – html of tablecss (
str
) – css contents IfXLSX2HTMLConverter.optimize_styles
set to False then css is empty
- html: str¶
- css: str = ''¶
Parser¶
- class xlsx2html.parser.parser.Column(index, letter, width, hidden=False)[source]¶
Bases:
object
- index: int¶
- letter: str¶
- width: float¶
- class xlsx2html.parser.parser.MergedCellInfo(colspan=None, rowspan=None, cells=<factory>)[source]¶
Bases:
object
- colspan: Optional[int] = None¶
- rowspan: Optional[int] = None¶
- cells: List[openpyxl.cell.cell.Cell]¶
- class xlsx2html.parser.parser.ParserResult(cols, rows, images)[source]¶
Bases:
object
- cols: List[xlsx2html.parser.parser.Column]¶
- rows: List[List[xlsx2html.parser.cell.CellInfo]]¶
- images: Dict[Tuple[int, int], List[xlsx2html.parser.image.ImageInfo]]¶
- class xlsx2html.parser.parser.XLSXParser(wb, locale='en', parse_formula=False, fb=None)[source]¶
Bases:
object
- class xlsx2html.parser.cell.Border(style, color=None)[source]¶
Bases:
object
- style: str¶
- color: Optional[str] = None¶
- class xlsx2html.parser.cell.Borders(top=None, left=None, right=None, bottom=None, diagonal_up=None, diagonal_down=None)[source]¶
Bases:
object
- top: Optional[xlsx2html.parser.cell.Border] = None¶
- left: Optional[xlsx2html.parser.cell.Border] = None¶
- right: Optional[xlsx2html.parser.cell.Border] = None¶
- bottom: Optional[xlsx2html.parser.cell.Border] = None¶
- diagonal_up: Optional[xlsx2html.parser.cell.Border] = None¶
- diagonal_down: Optional[xlsx2html.parser.cell.Border] = None¶
- class xlsx2html.parser.cell.Fill(pattern, color)[source]¶
Bases:
object
- pattern: str¶
- color: Optional[str]¶
- class xlsx2html.parser.cell.Font(size, color=None, italic=False, underline=False, bold=False, strike=False, overline=False, outline=False, shadow=False)[source]¶
Bases:
object
- size: int¶
- color: Optional[str] = None¶
- italic: bool = False¶
- underline: bool = False¶
- bold: bool = False¶
- strike: bool = False¶
- overline: bool = False¶
- outline: bool = False¶
- shadow: bool = False¶
- class xlsx2html.parser.cell.Alignment(horizontal=None, vertical=None, indent=None, text_rotation=None)[source]¶
Bases:
object
- horizontal: Optional[str] = None¶
- vertical: Optional[str] = None¶
- indent: Optional[float] = None¶
- text_rotation: Optional[int] = None¶
- class xlsx2html.parser.cell.CellInfo(id, column, column_letter, row, coordinate, value, formatted_value, alignment, colspan=None, rowspan=None, height=19, border=None, fill=None, font=None)[source]¶
Bases:
object
- id: str¶
- column: int¶
- column_letter: str¶
- row: int¶
- coordinate: str¶
- value: Any¶
- formatted_value: str¶
- alignment: xlsx2html.parser.cell.Alignment¶
- colspan: Optional[int] = None¶
- rowspan: Optional[int] = None¶
- height: int = 19¶
- border: Optional[xlsx2html.parser.cell.Borders] = None¶
- fill: Optional[xlsx2html.parser.cell.Fill] = None¶
- font: Optional[xlsx2html.parser.cell.Font] = None¶
Renderer¶
Formatters¶
- class xlsx2html.format.number.ColorNumberPattern(*args, **kwargs)[source]¶
Bases:
babel.numbers.NumberPattern
- apply(value, locale, **kwargs)[source]¶
Renders into a string a number following the defined pattern.
Forced decimal quantization is active by default so we’ll produce a number string that is strictly following CLDR pattern definitions.
- Parameters
value (decimal.Decimal|float|int) – The value to format. If this is not a Decimal object, it will be cast to one.
locale (str|babel.core.Locale) – The locale to use for formatting.
currency (str|None) – Which currency, if any, to format as.
currency_digits (bool) – Whether or not to use the currency’s precision. If false, the pattern’s precision is used.
decimal_quantization (bool) – Whether decimal numbers should be forcibly quantized to produce a formatted output strictly matching the CLDR definition for the locale.
force_frac – DEPRECATED - a forced override for self.frac_prec for a single formatting invocation.
- Returns
Formatted decimal string.
- Return type
str
- xlsx2html.format.number.format_decimal(number, format=None, locale='en_US_POSIX')[source]¶
Return the given decimal number formatted for a specific locale.
>>> format_decimal(1.2345, locale='en_US') '1.234' >>> format_decimal(1.2346, locale='en_US') '1.235' >>> format_decimal(-1.2346, locale='en_US') '-1.235' >>> format_decimal(1.2345, locale='sv_SE') '1,234' >>> format_decimal(1.2345, locale='de') '1,234'
The appropriate thousands grouping and the decimal separator are used for each locale:
>>> format_decimal(12345.5, locale='en_US') '12,345.5'
- Parameters
number (
Any
) – the number to formatformat (
Optional
[str
]) –locale (
str
) – the Locale object or locale identifier
- Return type
str
- xlsx2html.format.dt.format_datetime(datetime, fmt, locale='en_US_POSIX', tzinfo=None)[source]¶
- Return type
str
- xlsx2html.format.dt.format_time(time, fmt, locale='en_US_POSIX', tzinfo=None, date=None)[source]¶
- Return type
str
- xlsx2html.format.locale.parse_locale_code(code)[source]¶
>>> parse_locale_code('-404') 'zh_Hant_TW' >>> parse_locale_code('404') 'zh_Hant_TW' >>> parse_locale_code('0404') 'zh_Hant_TW' >>> parse_locale_code('58050')
- Return type
Optional
[str
]
- xlsx2html.format.locale.extract_locale_from_format(fmt)[source]¶
>>> extract_locale_from_format('[$-404]e/m/d') ('zh_Hant_TW', 'e/m/d') >>> extract_locale_from_format('[$USD-404]e/m/d') ('zh_Hant_TW', 'e/m/d') >>> extract_locale_from_format('[$$-404]#.00') ('zh_Hant_TW', '#.00') >>> extract_locale_from_format('[RED]e/m/d') (None, '[RED]e/m/d')
- Return type
Tuple
[Optional
[str
],str
]
- class xlsx2html.format.hyperlink.Hyperlink(title, location=None, target=None, href=None)[source]¶
Bases:
object
- title: str¶
- location: Optional[str] = None¶
- target: Optional[str] = None¶
- href: Optional[str] = None¶
- xlsx2html.format.hyperlink.resolve_hyperlink_formula(cell, f_cell=None)[source]¶
- Return type
Optional
[Hyperlink
]
Contributors¶
Run tests¶
pip install -r requirements.txt
pytest # run tests
tox # run test matrix
Run tests with pyenv with specific python and pypy¶
pyenv install 3.10-dev pypy3.7-7.3.5
pyenv local 3.10-dev pypy3.7-7.3.5
pip install -r requirements.txt
tox -e py310,pypy3
Type checks¶
tox -e type
Lint code¶
tox -e qa
Before commit¶
Install git hook
pip install -r requirements.txt
pre-commit install
For pycharm needs install tox
to global
Docs¶
pip install -r requirements.txt
cd docs
make html