XLSX to CSV Converter
Convert Excel XLSX files to CSV format entirely in your browser. Drag and drop your file, preview the data, select sheets, customize the delimiter and encoding, then download the CSV. Your data never leaves your device.
XLSX vs CSV Format Differences
XLSX and CSV are both used to store tabular data, but they differ fundamentally in structure, capability, and file size. Understanding these differences is essential for choosing the right format for a given task and for anticipating what changes when you convert between them.
XLSX is the modern Excel file format introduced with Microsoft Office 2007. It is technically a ZIP archive containing XML files that describe cell values, formulas, formatting, charts, images, pivot tables, and metadata. Macros are not stored in .xlsx files; they require the separate .xlsm variant. A single XLSX file can contain multiple worksheets, each with its own formatting rules and data types. The format supports 1,048,576 rows and 16,384 columns per sheet, along with rich features like data validation, conditional formatting, and named ranges.
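Because an XLSX file is a ZIP archive, the parts inside it can be listed with a general-purpose ZIP reader. The following is a minimal sketch using Python's standard library; the function name `list_xlsx_parts` is hypothetical and used only for illustration.

```python
import zipfile


def list_xlsx_parts(path):
    """Return the names of the files stored inside an XLSX (ZIP) archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

For a typical workbook the result includes entries such as `[Content_Types].xml`, `xl/workbook.xml`, and one `xl/worksheets/sheetN.xml` file per worksheet.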
CSV, by contrast, stands for Comma-Separated Values. It is a plain text format where each line represents a row of data and fields within each row are separated by a delimiter, typically a comma. CSV files have no concept of worksheets, cell formatting, data types, formulas, or any structural metadata. Every value is stored as a text string. A CSV file is readable by any text editor, any programming language, and virtually every data processing tool in existence.
| Feature | XLSX | CSV |
|---|---|---|
| File structure | ZIP archive with XML | Plain text |
| Multiple sheets | Yes | No |
| Cell formatting | Full (fonts, colors, borders) | None |
| Formulas | Yes | No (values only) |
| Data types | Number, text, date, boolean | Text only |
| Charts and images | Yes | No |
| Typical file size | Compressed; grows with formatting and embedded objects | Uncompressed text; data only |
| Universal compatibility | Requires Excel or compatible app | Opens in any text editor |
| Database import | Requires adapter | Direct import |
| Version control (Git) | Binary diffs only | Line-by-line diffs |
When to Use CSV
CSV is the preferred format in several common scenarios. Database administrators use CSV for bulk imports and exports because every major database system (MySQL, PostgreSQL, SQL Server, SQLite, MongoDB) can load it with built-in commands or bundled tools. Data scientists working with Python pandas, R, or Julia reach for CSV as the default interchange format because loading a CSV file takes a single function call and no extra parsing dependencies.
Web applications that accept data uploads almost universally support CSV. When you export contacts from a CRM, download transaction history from a bank, or import products into an e-commerce platform, the format is nearly always CSV. Its simplicity means that the receiving system does not need to parse complex XML structures or handle Excel-specific quirks.
Version control is another strong argument for CSV. If you track data files in Git, CSV files produce meaningful line-by-line diffs that show exactly which rows changed. XLSX files, being binary archives, show only that the file changed without indicating what specifically was modified. Teams collaborating on data files through Git or other version control systems strongly prefer CSV for this reason.
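To illustrate what a line-by-line diff of CSV data looks like, here is a small sketch that uses Python's `difflib` to produce output in the same unified format Git displays. The sample data, the file name `prices.csv`, and the helper `csv_diff` are all made up for the example.

```python
import difflib

old_csv = "id,name,price\n1,Widget,9.99\n2,Gadget,19.99\n"
new_csv = "id,name,price\n1,Widget,9.99\n2,Gadget,24.99\n"


def csv_diff(before, after):
    """Return unified-diff lines between two CSV texts."""
    return list(difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="a/prices.csv",
        tofile="b/prices.csv",
        lineterm="",
    ))


for line in csv_diff(old_csv, new_csv):
    print(line)
```

The changed row appears once with a `-` prefix (old price) and once with a `+` prefix (new price), while unchanged rows are shown only as context.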
API integrations commonly use CSV alongside JSON and XML. Many APIs offer CSV as an export option because it requires the least parsing overhead on the client side. Streaming large datasets row by row is straightforward with CSV because each record can be parsed as soon as it has been read (normally one line, or several lines when a quoted field contains a line break), whereas JSON requires tracking nested brackets and XLSX requires decompressing an archive.
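As a rough sketch of record-by-record streaming, the example below totals one column while holding only a single record in memory at a time. The function `sum_column` and the sample columns are hypothetical; in practice the `lines` argument would be an open file or a network stream rather than an in-memory string.

```python
import csv
import io


def sum_column(lines, column):
    """Stream CSV records one at a time and total a numeric column."""
    total = 0.0
    for record in csv.DictReader(lines):
        total += float(record[column])
    return total


sample = io.StringIO("id,amount\n1,10.5\n2,4.5\n")
print(sum_column(sample, "amount"))  # 15.0
```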
Data Loss Considerations When Converting
Converting from XLSX to CSV is inherently a lossy process. The XLSX format carries significantly more information than CSV can represent, and all of that additional information is discarded during conversion. Being aware of what you lose helps you make informed decisions about when conversion is appropriate.
Formatting Loss
All visual formatting is lost. Cell background colors, font styles (bold, italic, underline), font sizes, text alignment, cell borders, and number format masks disappear entirely. A number formatted as currency ($1,234.56) in Excel becomes the raw number 1234.56 in CSV. Dates formatted as "March 27, 2026" may convert to their underlying serial number or to a different date format depending on the parsing library.
Formula Loss
Formulas are replaced by their last calculated values. A cell containing =VLOOKUP(A2, Sheet2!A:B, 2, FALSE) becomes whatever value the VLOOKUP returned. The formula logic, the reference to Sheet2, and the dynamic calculation capability are all gone. If the source spreadsheet contains errors (#REF!, #N/A, #VALUE!), these error codes appear as text strings in the CSV.
Merged Cells
Merged cells are split back into individual cells. Only the top-left cell of a merged range retains the value; all other cells in the range become empty. This can create confusing gaps in the CSV output if you are not expecting it.
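If the gaps left by vertically merged cells are a problem downstream, one possible post-processing step is to copy each value down into the empty cells beneath it. The sketch below assumes rows are lists of strings and that an empty string in the chosen column always means "part of a merged range"; cells that were genuinely empty in the source would be filled as well, so apply it only to columns where that assumption holds. The helper name `fill_down` is hypothetical.

```python
def fill_down(rows, column):
    """Copy the last non-empty value in `column` into the empty cells below it."""
    filled = []
    last_value = ""
    for row in rows:
        row = list(row)  # copy so the caller's data is not modified
        if row[column] == "":
            row[column] = last_value
        else:
            last_value = row[column]
        filled.append(row)
    return filled


rows = [
    ["Region", "City"],
    ["North", "Oslo"],
    ["", "Bergen"],   # was covered by a vertically merged "North" cell
    ["South", "Rome"],
]
print(fill_down(rows, 0))
```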
Data Types
CSV does not distinguish between numbers, dates, booleans, and text. A value like 007, stored in Excel as text or displayed through a leading-zero number format, may become 7 once a parser treats the CSV field as a number. Dates stored as serial numbers may appear as integers rather than human-readable dates. Boolean TRUE/FALSE values become text strings.
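The short sketch below shows the effect with Python's standard `csv` module: every field is read back as a string, and the leading zeros disappear only when the consumer converts the field to a number. The sample data is invented for the example.

```python
import csv
import io

reader = csv.reader(io.StringIO("code,active\n007,TRUE\n"))
header, row = list(reader)

print(row)          # ['007', 'TRUE']  (every field is a string)
print(int(row[0]))  # 7  (numeric parsing drops the leading zeros)
```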
Character Encoding Issues
Character encoding determines how characters are represented as bytes in a file. Choosing the wrong encoding can corrupt characters, particularly accented letters, Asian scripts, and special symbols.
UTF-8
UTF-8 is the most widely supported encoding and the default recommendation for almost every use case. It can represent every Unicode character, including accented Latin letters (é, ü, à), Chinese characters, Arabic script, emoji, and mathematical symbols. UTF-8 is backward-compatible with ASCII, meaning that files containing only basic English characters are identical whether saved as UTF-8 or ASCII.
ASCII
ASCII encodes only 128 characters: English letters, digits, basic punctuation, and control characters. Any character outside this range will be lost or replaced with a placeholder during conversion. Use ASCII only when you are certain your data contains no international characters and the receiving system explicitly requires it.
Latin-1 (ISO-8859-1)
Latin-1 extends ASCII with 128 additional characters covering Western European languages (French, German, Spanish, Portuguese, Italian). It cannot represent Eastern European, Asian, or other non-Western scripts. Legacy systems in Western Europe sometimes require Latin-1 encoding, but for new projects, UTF-8 is always the better choice.
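To make the difference between the three encodings concrete, the sketch below encodes the same word in each of them and then shows how reading UTF-8 bytes as Latin-1 produces garbled text. The sample word is arbitrary.

```python
text = "café"

utf8_bytes = text.encode("utf-8")                      # é takes two bytes
latin1_bytes = text.encode("latin-1")                  # é takes one byte
ascii_bytes = text.encode("ascii", errors="replace")   # é is replaced by "?"

print(utf8_bytes)    # b'caf\xc3\xa9'
print(latin1_bytes)  # b'caf\xe9'
print(ascii_bytes)   # b'caf?'

# Decoding UTF-8 bytes with the wrong encoding corrupts the accented letter
print(utf8_bytes.decode("latin-1"))  # cafÃ©
```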
BOM (Byte Order Mark)
Some applications, particularly older versions of Excel, expect a UTF-8 BOM at the beginning of CSV files to correctly detect the encoding. The BOM is a three-byte sequence (EF BB BF) that signals UTF-8 encoding. This converter adds a UTF-8 BOM by default when UTF-8 encoding is selected, ensuring maximum compatibility with Excel and other spreadsheet applications.
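For scripts that need to produce or consume the same kind of file, Python exposes the BOM through the `utf-8-sig` codec. This is a minimal sketch with invented sample data, not a description of how this converter is implemented internally.

```python
import codecs

# Encoding with "utf-8-sig" prepends the three-byte BOM
data = "name\nZoë\n".encode("utf-8-sig")

print(data[:3])                     # b'\xef\xbb\xbf'
print(data[:3] == codecs.BOM_UTF8)  # True

# Decoding with "utf-8-sig" strips the BOM; plain "utf-8" would keep it
# as an invisible U+FEFF character at the start of the first field.
text = data.decode("utf-8-sig")
print(text.startswith("name"))      # True
```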
Why Excel Formulas Are Not Preserved
Excel formulas exist within a computational model that CSV simply cannot represent. A formula like =IF(AND(A1>10, B1<5), "High", "Low") depends on the spreadsheet engine's ability to evaluate logical functions, reference other cells, and return computed results. CSV is a static data format with no computation layer.
When you convert XLSX to CSV, every formula cell is replaced by its most recently calculated value. This value was computed by Excel (or whichever application last saved the file) and is stored alongside the formula in the XLSX file. The SheetJS library used by this tool reads these cached values and writes them to the CSV output.
This behavior has important implications. If you modified cell values that feed into formulas but the workbook was not recalculated before it was saved (for example, because calculation is set to manual), the cached formula results may be stale. Always ensure your spreadsheet is fully calculated before converting to CSV. In Excel, F9 recalculates formulas that depend on changed cells, and Ctrl+Alt+F9 forces a full recalculation of every formula in all open workbooks.
Some formulas produce different results depending on context. Volatile functions like NOW(), TODAY(), and RAND() return different values each time they recalculate. The CSV will contain whatever value was cached at the moment of the last save, which may differ from the current date or a new random number.
Handling Large Files
This converter processes files entirely in browser memory using JavaScript. For small to medium files (under 10 MB), the conversion is nearly instantaneous. For larger files, processing time increases proportionally with file size and the number of cells.
Files between 10 MB and 50 MB are supported but may take several seconds to parse. During this time, the browser tab may appear briefly unresponsive as the JavaScript engine processes the data. The preview will show the first 20 rows regardless of file size, allowing you to verify the conversion before downloading.
For files exceeding 50 MB, you may encounter browser memory limitations. In this case, consider splitting the spreadsheet into smaller files in Excel before converting, or using a command-line tool like LibreOffice's headless mode or Python's openpyxl library for batch processing.
Optimization Strategies
Several strategies can reduce the effective size of your conversion task. Delete unused rows and columns before converting. Excel files often contain formatting or data in cells far beyond the visible data range, inflating file size unnecessarily. Clear any cells below and to the right of your actual data, then save and re-upload.
If you only need specific columns, consider using Excel's filter or column-hiding features to identify the relevant data, then copy it to a new sheet before converting. This reduces both the parsing workload and the resulting CSV file size.
CSV Standards and RFC 4180
Despite its apparent simplicity, CSV has historically lacked a formal specification, leading to inconsistent implementations across different tools. RFC 4180, published in October 2005 as an informational document rather than a binding standard, provides a common definition that most modern tools follow.
According to RFC 4180, each record is located on a separate line, delimited by a line break (CRLF). The last record in the file may or may not end with a line break. An optional header line may appear as the first line with the same format as normal records. Fields are separated by commas. Fields containing commas, double quotes, or line breaks must be enclosed in double quotes. A double quote appearing inside a quoted field must be escaped by preceding it with another double quote.
This converter follows RFC 4180 conventions by default. Fields containing the selected delimiter, double quotes, or newline characters are automatically enclosed in double quotes. Internal double quotes are escaped by doubling them. The default line ending is CRLF as specified by the RFC, with an option to switch to LF for Unix and macOS environments.
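The quoting rules described above can be reproduced with Python's standard `csv` module, as in the sketch below. The helper `to_csv` and the sample rows are hypothetical and only illustrate the conventions; they are not this converter's own code.

```python
import csv
import io


def to_csv(rows, delimiter=",", line_ending="\r\n"):
    """Serialize rows, quoting only the fields that need it."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=delimiter,
        quoting=csv.QUOTE_MINIMAL,    # quote fields containing delimiter, quote, or newline
        lineterminator=line_ending,   # CRLF by default, as in RFC 4180
    )
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["name", "note"], ["Smith, Jane", 'said "hi"']]
print(repr(to_csv(rows)))
# 'name,note\r\n"Smith, Jane","said ""hi"""\r\n'
```

Passing `delimiter=";"` leaves `Smith, Jane` unquoted, since the comma is then ordinary data, while the field containing double quotes is still quoted and escaped.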
Automating XLSX to CSV Conversion
While this browser-based tool is ideal for occasional conversions, workflows that require frequent or batch conversion benefit from automation. Several approaches exist depending on your technical environment.
Python with openpyxl or pandas
Python is the most popular language for data processing automation. The pandas library reads XLSX files with a single function call (pd.read_excel) and writes CSV with another (df.to_csv). For large files, the openpyxl library provides read-only mode that processes rows iteratively without loading the entire file into memory.
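A minimal sketch of the pandas approach follows. It assumes pandas and openpyxl are installed; the function name `xlsx_to_csv` and the output naming scheme are hypothetical choices for illustration. It writes one CSV per worksheet, with a UTF-8 BOM for Excel compatibility.

```python
from pathlib import Path


def xlsx_to_csv(xlsx_path, out_dir):
    """Write one CSV per worksheet and return the paths that were created."""
    # Imported here so the module loads even where pandas is not installed
    import pandas as pd

    xlsx_path = Path(xlsx_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # sheet_name=None loads every sheet into a dict of {sheet name: DataFrame}
    sheets = pd.read_excel(xlsx_path, sheet_name=None)

    written = []
    for name, frame in sheets.items():
        target = out_dir / f"{xlsx_path.stem}_{name}.csv"
        frame.to_csv(target, index=False, encoding="utf-8-sig")
        written.append(target)
    return written
```

By default pandas infers column types, so values such as 00123 may lose their leading zeros; passing `dtype=str` to `pd.read_excel` keeps cell values as text where that matters.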
LibreOffice Command Line
LibreOffice can convert files in headless mode without a graphical interface. The command "libreoffice --headless --convert-to csv file.xlsx" processes the file and outputs a CSV. This approach integrates well with shell scripts and cron jobs for scheduled batch conversions.
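When the conversion is driven from a script rather than typed by hand, the same command can be assembled and run through Python's `subprocess` module. The sketch below assumes the `libreoffice` executable is on the PATH (on some systems it is named `soffice`); the helper names are hypothetical. The `--outdir` option tells LibreOffice where to place the result.

```python
import subprocess


def build_convert_command(xlsx_path, out_dir="."):
    """Build the LibreOffice headless command for one XLSX file."""
    return [
        "libreoffice", "--headless",
        "--convert-to", "csv",
        "--outdir", str(out_dir),
        str(xlsx_path),
    ]


def convert(xlsx_path, out_dir="."):
    """Run the conversion; raises CalledProcessError if LibreOffice fails."""
    subprocess.run(build_convert_command(xlsx_path, out_dir), check=True)
```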
Node.js with SheetJS
The same SheetJS library that powers this browser-based tool is available as an npm package (xlsx). A Node.js script can read XLSX files, iterate through sheets, and write CSV files programmatically. This is particularly useful for server-side conversion in web applications or API endpoints.
Power Automate and Zapier
No-code automation platforms like Microsoft Power Automate and Zapier can trigger XLSX to CSV conversions automatically. For example, you can create a flow that monitors a SharePoint folder for new XLSX files and automatically converts them to CSV, saving the result to another folder or uploading it to a database.
Common Problems and Solutions
Garbled Characters After Conversion
If you see garbled characters (mojibake) after opening the CSV, the encoding does not match what the receiving application expects. Try re-converting with UTF-8 encoding, which is the most universally compatible option. If opening in Excel, try importing via Data > From Text/CSV rather than double-clicking the file, which allows you to specify the encoding manually.
Dates Appearing as Numbers
Excel stores dates internally as serial numbers that count days, with 1 representing January 1, 1900. If dates appear as five-digit numbers like 46108 (the serial for March 27, 2026), the converter is outputting the raw serial value instead of the formatted date. This tool is configured to output formatted date strings, but if you encounter this issue, apply a date format to the affected cells in Excel, save the file, and re-upload it.
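If a CSV already contains raw serial numbers, they can be converted back to dates after the fact. The sketch below assumes the workbook uses Excel's default 1900 date system and that the serials fall on or after March 1, 1900; the function name `serial_to_date` is hypothetical.

```python
from datetime import date, timedelta

# Day 0 is 1899-12-30 rather than 1899-12-31 because Excel's 1900 date
# system counts a nonexistent February 29, 1900.
EXCEL_EPOCH = date(1899, 12, 30)


def serial_to_date(serial):
    """Convert an Excel serial number (1900 date system) to a date."""
    return EXCEL_EPOCH + timedelta(days=int(serial))


print(serial_to_date(46108))  # 2026-03-27
```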
Leading Zeros Stripped
Zip codes, phone numbers, and product codes with leading zeros (like 00123) may lose those zeros because CSV parsers often treat all-digit fields as numbers. To preserve leading zeros, ensure the cells in Excel are formatted as text before converting. When importing the CSV into another application, set the type of such columns to text so the zeros are not stripped again on the way in.
Comma Inside Fields
If your data contains commas (such as addresses or descriptions), use this tool's automatic quoting feature, which wraps any field containing the delimiter in double quotes. Alternatively, switch to a semicolon or tab delimiter to avoid conflicts with commas in the data itself.