Check whether a CSV file is valid. This tool also hints at where each error is located, which makes debugging easier.
CSV Validator & Linter
What is CSV?
A comma-separated values (CSV) file is a delimited text file that uses a comma or another delimiter to separate values.
This tool supports custom delimiters.
This tool also reports information about any errors it finds, for debugging purposes (linting).
Sample of valid CSV text:
cat, dog, horse
house, car, cycle
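As a rough illustration of the kind of check described above, the following sketch uses Python's standard `csv` module to parse text with a custom delimiter and report the line number of every row whose field count differs from the first row. The function name `find_ragged_rows` is made up for this example; it is not how the tool itself is implemented.

```python
import csv
import io


def find_ragged_rows(text, delimiter=","):
    """Return (line_number, field_count) for rows whose width differs from the first row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    expected = None
    problems = []
    for row in reader:
        if not row:
            continue  # csv.reader yields an empty list for a blank line
        if expected is None:
            expected = len(row)  # the first non-blank row sets the expected width
        elif len(row) != expected:
            problems.append((reader.line_num, len(row)))
    return problems


valid = "cat, dog, horse\nhouse, car, cycle\n"
broken = "cat;dog;horse\nhouse;car\n"
print(find_ragged_rows(valid))                  # no problems found
print(find_ragged_rows(broken, delimiter=";"))  # line 2 has only 2 fields
```

Reporting `reader.line_num` rather than a row counter keeps the location correct even when a quoted field spans several physical lines.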
Copyright © 2021 Toolkit Bay. All Rights Reserved
I wrote an open source Python tool to simplify validation of such files. It is available from http://pypi.python.org/pypi/cutplace/.
The basic idea is that you describe the data format in a structured interface specification using OpenOffice.org, Excel or plain CSV. This takes a few minutes, and the result is legible enough to serve as documentation too. We use it to validate files with about 200,000 rows on a daily basis.
You can validate a CSV file using the command line:
cutplace specification.csv data.csv
In case invalid data rows are found, the exit code is 1. If you need more control, you can write a little Python script that imports the cutplace module and adds a listener for validation events.
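If the exit code is all you need, a small wrapper around the command line may be enough. The sketch below only relies on what is stated above (exit code 1 when invalid rows are found) and assumes that exit code 0 means the data passed; the helper name `run_validator` is made up for this example, and it does not use the cutplace module's Python API.

```python
import subprocess
import sys


def run_validator(command):
    """Run a validation command; return (passed, combined_output)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    # Assumption: exit code 0 means the data passed validation.
    return completed.returncode == 0, completed.stdout + completed.stderr


# Intended use, mirroring the command line above (requires cutplace to be installed):
#   passed, output = run_validator(["cutplace", "specification.csv", "data.csv"])

# Stand-in command so the sketch runs without cutplace: a process that exits with 1.
passed, output = run_validator([sys.executable, "-c", "raise SystemExit(1)"])
print("valid" if passed else "invalid rows found")
```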
As an example, here’s a specification that would validate the sample data you provided, filling the gaps of your short description by making a few assumptions. (I’m writing the specification in CSV to inline it in this post. In practice I prefer OpenOffice.org’s Calc and ODS because I can use more formatting and make it easier to read and maintain.)
,"Interface: Show statistics"
,
,"Data format"
"D","Format","CSV"
"D","Item delimiter",";"
"D","Header","1"
"D","Encoding","ASCII"
,
,"Fields"
,"Name","Example","Empty","Length","Type","Rule"
"F","date","15-Mar-10",,,"RegEx","\d\d-[A-Z][a-z][a-z]-\d\d"
"F","id","231",,,"Integer","0:"
"F","shown","345",,,"Integer","0:"
,
,"Checks"
,"Description","Type","Rule"
"C","id per date must be unique","IsUnique","date, id"
Lines starting with «D» describe the basic data format. In this case it is a CSV file using «;» as delimiter with 1 header line in ASCII encoding.
Lines starting with «F» describe the various fields. For example,
,"Name","Example","Empty","Length","Type","Rule"
"F","id","231",,,"Integer","0:"
defines a mandatory field «id» of type Integer with a value of 0 or greater. To allow the field to be empty, specify an «X» in the «Empty» column:
,"Name","Example","Empty","Length","Type","Rule"
"F","id","231","X",,"Integer","0:"
Finally, there is an optional section for more advanced checks that span the whole file, not only single rows. For example, if each date in your file must provide data for an id only once, you can state this using:
,"Description","Type","Rule"
"C","id per date must be unique","IsUnique","date, id"
Any row that starts with an empty column can contain any text you like and will not be processed during validation. This is useful for headings, comments and so on.
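To make the meaning of the field rules and the uniqueness check more concrete, here is a hand-written Python sketch that applies the same three ideas to the sample specification: a regular expression for «date», non-negative integers for «id» and «shown», and a file-wide uniqueness check on the pair (date, id). This is not how cutplace is implemented; the function names and the error messages are made up for this illustration.

```python
import csv
import io
import re

# Same pattern as the Rule column of the "date" field in the specification.
DATE_RULE = re.compile(r"\d\d-[A-Z][a-z][a-z]-\d\d")


def check_integer(value, minimum=0, allow_empty=False):
    """Return None if the value is acceptable, otherwise an error message."""
    if value == "":
        return None if allow_empty else "value must not be empty"
    try:
        number = int(value)
    except ValueError:
        return f"{value!r} is not an integer"
    if number < minimum:
        return f"{number} is less than {minimum}"
    return None


def validate(text, delimiter=";"):
    """Return a list of (line_number, message) for data with a 1-line header."""
    errors = []
    first_seen = {}  # (date, id) -> line number of first occurrence
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        line = reader.line_num
        date = row.get("date") or ""
        if not DATE_RULE.fullmatch(date):
            errors.append((line, f"date: {date!r} does not match the rule"))
        for name in ("id", "shown"):
            problem = check_integer(row.get(name) or "")
            if problem:
                errors.append((line, f"{name}: {problem}"))
        key = (date, row.get("id") or "")
        if key in first_seen:
            errors.append((line, f"id per date must be unique (first seen on line {first_seen[key]})"))
        else:
            first_seen[key] = line
    return errors


data = "date;id;shown\n15-Mar-10;231;345\n15-Mar-10;232;-5\n15-Mar-10;231;7\n"
for line, message in validate(data):
    print(f"line {line}: {message}")
```

Passing `allow_empty=True` to `check_integer` corresponds to putting an «X» in the «Empty» column.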
Are there any good sites or services for validating the consistency of a CSV file?
Something like the W3C validator, but for CSV?
Thanks!
asked Jul 18, 2011 at 20:27 by Scherbius.com
The Open Data Institute is developing a CSV validation service that will allow users to check the structure of their data as well as validate it against a simple schema.
The service is still very much in alpha but can be found here:
http://csvlint.io/
The code for the application and the underlying library are both open source:
https://github.com/theodi/csvlint
https://github.com/theodi/csvlint.rb
The README in the library provides a summary of the errors and warnings that can be generated. The following types of error can be reported:
:wrong_content_type - content type is not text/csv
:ragged_rows - row has a different number of columns (than the first row in the file)
:blank_rows - completely empty row, e.g. blank line or a line where all column values are empty
:invalid_encoding - encoding error when parsing row, e.g. because of invalid characters
:not_found - HTTP 404 error when retrieving the data
:quoting - problem with quoting, e.g. missing or stray quote, unclosed quoted field
:whitespace - a quoted column has leading or trailing whitespace
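Two of these error types, blank rows and quoting problems, can be approximated with Python's standard `csv` module. The sketch below is not the csvlint.rb implementation; it only borrows the error names from the list above, and the function name is made up. It relies on the `strict=True` dialect option, which makes the reader raise `csv.Error` on malformed quoting instead of guessing.

```python
import csv
import io


def lint_quoting_and_blank_rows(text, delimiter=","):
    """Return (line_number, problem) tuples; stops at the first quoting error."""
    problems = []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, strict=True)
    try:
        for row in reader:
            # An empty list (blank line) or a row of empty cells counts as blank.
            if all(cell.strip() == "" for cell in row):
                problems.append((reader.line_num, "blank_rows"))
    except csv.Error as exc:
        problems.append((reader.line_num, f"quoting: {exc}"))
    return problems


print(lint_quoting_and_blank_rows("a,b\n\n,\n"))       # two blank rows
print(lint_quoting_and_blank_rows('a,b\nc,"d"e\n'))    # stray text after a closing quote
```

Parsing stops at the first quoting error because the reader's state is no longer reliable after that point.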
The following types of warning can be reported:
:no_encoding - the Content-Type header returned in the HTTP request does not have a charset parameter
:encoding - the character set is not UTF-8
:no_content_type - file is being served without a Content-Type header
:excel - no Content-Type header and the file extension is .xls
:check_options - CSV file appears to contain only a single column
:inconsistent_values - inconsistent values in the same column. Reported if <90% of values seem to have same data type (either numeric or alphanumeric including punctuation)
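The 90% rule behind :inconsistent_values can be sketched as follows. This is a guess at the general idea, not the library's actual logic: it treats a value as numeric if Python's `float()` accepts it (which also accepts strings such as "nan" and "inf"), ignores empty cells, and assumes all rows have the same width. The function names are made up for this example.

```python
def is_numeric(value):
    """Treat anything float() accepts as numeric (an assumption of this sketch)."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def inconsistent_columns(rows, threshold=0.9):
    """Return indexes of columns whose dominant type covers less than `threshold` of values."""
    flagged = []
    for index, column in enumerate(zip(*rows)):
        values = [value for value in column if value.strip()]
        if not values:
            continue  # nothing to judge in an all-empty column
        numeric = sum(is_numeric(value) for value in values)
        dominant = max(numeric, len(values) - numeric)
        if dominant / len(values) < threshold:
            flagged.append(index)
    return flagged


rows = [["1", "a"], ["2", "b"], ["x", "c"], ["4", "d"]]
print(inconsistent_columns(rows))  # column 0 is only 75% numeric
```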
answered Feb 11, 2014 at 17:55 by ldodds
The National Archives developed a CSV Schema Language and CSV Validator, software written in Java. It’s open source.
answered Aug 7, 2016 at 12:05 by Milos
To validate a CSV file, I use the Rainbow CSV extension in Visual Studio Code, and I also open the CSV file in Excel.
answered Feb 15, 2018 at 16:18 by mruanova
There is a great way to validate your CSV file. I am referring to this article, where the whole process is explained in the tiniest detail.
The validation process has two steps. First, post the file to the API. Once your file is accepted, the API returns a polling endpoint that contains the results of the validation process. There is a 10 MB limit per file.
answered Feb 5, 2020 at 23:45 by monkrus
CSV Lint at csvlint.com (not .io) is a service we’re building to solve this problem. It checks CSV files cell by cell against user-defined validation rules / schemas.
We spent a lot of time tweaking the UI so that users can easily create complex validation rules / schemas that meet their business needs without writing a single line of code.
Our offline validation feature lets users see the results in real time, even when validating multiple large files (with millions of rows), and most importantly it keeps user data private.
answered Jun 17, 2018 at 6:57 by Joe
CSV File Validator
Validates a CSV file against a user-defined schema (returns an object with the data and the validation error messages)
Getting csv-file-validator
npm
npm install --save csv-file-validator
yarn
yarn add csv-file-validator --save
Example
import CSVFileValidator from 'csv-file-validator'

CSVFileValidator(file, config)
    .then(csvData => {
        csvData.data // Array of objects from file
        csvData.inValidData // Array of error messages
    })
    .catch(err => {})
Please see Demo for more details /demo/index.html
API
CSVFileValidator(file, config)
Returns a Promise
file
Type: File
.csv file
config
Type: Object
The config object should contain:
headers - Type: Array, row header (title) objects
isHeaderNameOptional - Type: Boolean, skip the header name if it is empty
isColumnIndexAlphabetic - Type: Boolean, convert the numeric column index to an alphabetic letter
parserConfig - Type: Object, optional, papaparse options. Default options, which can’t be overridden: skipEmptyLines, complete and error
const config = {
    headers: [], // required
    isHeaderNameOptional: false, // default (optional)
    isColumnIndexAlphabetic: false // default (optional)
}
name
Type: String
name of the row header (title)
inputName
Type: String
key name under which the column value will be returned
optional
Type: Boolean
Makes the column optional. If true, the column value will be returned
headerError
Type: Function
If a header name is omitted or differs from the name in the config, the headerError function is called with the arguments headerValue, headerName, rowNumber, columnNumber
required
Type: Boolean
If required is true, the column value is checked to make sure it is not empty
requiredError
Type: Function
If the value is empty, the requiredError function is called with the arguments headerName, rowNumber, columnNumber
unique
Type: Boolean
If it is true, all values in this header (title) column are checked for uniqueness
uniqueError
Type: Function
If one of the header values is not unique, the uniqueError function is called with the arguments headerName, rowNumber
validate
Type: Function
Validates the column value. The column value is passed as the argument.
For example:
/**
 * @param {String} email
 * @return {Boolean}
 */
function(email) {
    return isEmailValid(email);
}
validateError
Type: Function
If validate returns false, the validateError function is called with the arguments headerName, rowNumber, columnNumber
dependentValidate
Type: Function
Validates a column value that depends on values in other columns.
The column value and the row are passed as the arguments.
For example:
/**
 * @param {String} email
 * @param {Array<string>} row
 * @return {Boolean}
 */
function(email, row) {
    return isEmailDependsOnSomeDataInRow(email, row);
}
isArray
Type: Boolean
If the column contains a list of values separated by commas, it is returned as an array in the result object
Config example
const config = {
    headers: [
        {
            name: 'First Name',
            inputName: 'firstName',
            required: true,
            requiredError: function (headerName, rowNumber, columnNumber) {
                return `${headerName} is required in the ${rowNumber} row / ${columnNumber} column`
            }
        },
        {
            name: 'Last Name',
            inputName: 'lastName',
            required: false
        },
        {
            name: 'Email',
            inputName: 'email',
            unique: true,
            uniqueError: function (headerName) {
                return `${headerName} is not unique`
            },
            validate: function(email) {
                return isEmailValid(email)
            },
            validateError: function (headerName, rowNumber, columnNumber) {
                return `${headerName} is not valid in the ${rowNumber} row / ${columnNumber} column`
            }
        },
        {
            name: 'Roles',
            inputName: 'roles',
            isArray: true
        },
        {
            name: 'Country',
            inputName: 'country',
            optional: true,
            dependentValidate: function(email, row) {
                return isEmailDependsOnSomeDataInRow(email, row);
            }
        }
    ]
}
Contributing
Any contributions you make are greatly appreciated.
Please read the Contributions Guidelines before submitting a PR.
License
MIT © Vasyl Stokolosa
Number of input fields different from number of schema fields
Sequence of input fields different from sequence of schema fields
Input fields that are not defined in the schema
Schema fields for which no input fields are defined
Names of input fields with different casing compared to schema fields
Multiple input fields defined with the same name
Multiple input fields mapped to the same schema field
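Most of the mismatches listed above can be detected by comparing the list of input field names with the list of schema field names. The following Python sketch shows one possible way to do that. The function name and the keys of the returned dictionary are made up for this example, and the last item (several input fields mapped to the same schema field) is left out because it needs an explicit field mapping that is not described here.

```python
from collections import Counter


def compare_fields(input_fields, schema_fields):
    """Compare input field names with schema field names and report mismatches."""
    schema_by_lower = {name.lower(): name for name in schema_fields}
    input_lower = {name.lower() for name in input_fields}
    counts = Counter(input_fields)
    return {
        "count_differs": len(input_fields) != len(schema_fields),
        # Same names on both sides, but in a different sequence.
        "order_differs": (
            sorted(input_fields) == sorted(schema_fields)
            and list(input_fields) != list(schema_fields)
        ),
        "not_in_schema": [n for n in input_fields if n.lower() not in schema_by_lower],
        "missing_from_input": [n for n in schema_fields if n.lower() not in input_lower],
        # Matches a schema field only when case is ignored.
        "casing_differs": [
            n for n in input_fields
            if n not in schema_fields and n.lower() in schema_by_lower
        ],
        "duplicates": sorted(n for n, c in counts.items() if c > 1),
    }


report = compare_fields(["ID", "date", "extra", "date"], ["date", "id", "shown"])
for check, result in report.items():
    print(f"{check}: {result}")
```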